NOVEL HUMAN GENES AND GENE EXPRESSION PRODUCTS

FIELD OF THE INVENTION 

The present invention relates to novel polynucleotides of human origin 
and the encoded gene products. 

BACKGROUND OF THE INVENTION

Identification of novel polynucleotides, particularly those that encode an 
expressed gene product, is important in the advancement of drug discovery, diagnostic 
technologies, and the understanding of the progression and nature of complex diseases 
such as cancer. Identification of genes expressed in different cell types isolated from 
sources that differ in disease state or stage, developmental stage, exposure to various
environmental factors, the tissue of origin, the species from which the tissue was 
isolated, and the like is key to identifying the genetic factors that are responsible for the 
phenotypes associated with these various differences. 

This invention provides novel human polynucleotides, the polypeptides 
encoded by these polynucleotides, and the genes and proteins corresponding to these
novel polynucleotides. 



SUMMARY OF THE INVENTION 

This invention relates to novel human polynucleotides and variants
thereof, their encoded polypeptides and variants thereof, to genes corresponding to these
polynucleotides and to proteins expressed by the genes. The invention also relates to
diagnostics and therapeutics comprising such novel human polynucleotides, their
corresponding genes or gene products, including probes, antisense nucleotides, and
antibodies. The polynucleotides of the invention correspond to a polynucleotide
comprising the sequence information of at least one of SEQ ID NOs:1-3351.

Various aspects and embodiments of the invention will be readily
apparent to the ordinarily skilled artisan upon reading the description provided herein.

DETAILED DESCRIPTION OF THE INVENTION 

The invention relates to polynucleotides comprising the disclosed
nucleotide sequences, to full-length cDNA, mRNA, genomic sequences, and genes
corresponding to these sequences and degenerate variants thereof, and to polypeptides
encoded by the polynucleotides of the invention and polypeptide variants.

Polypeptide variants differ from wild type protein in having one or more 
amino acid substitutions that either enhance, add, or diminish a biological activity of the 
wild type protein. 

Six of the polynucleotides disclosed herein encode new members of the MKK
kinase family; the coding region is found within the nucleotide region in parentheses: SEQ
ID NO:29 (nucleotides 295-421); SEQ ID NO:31 (298-397); SEQ ID NO:196 (37-322);
SEQ ID NO:3175 (nucleotides 14-164); SEQ ID NO:3190 (229-390); and SEQ ID
NO:3281 (15-182). Twenty-four of the polynucleotides encode new members of the family
of transcription factor proteins having a basic region plus leucine zipper: SEQ ID NO:410
(42-191); SEQ ID NO:552 (116-288); SEQ ID NO:768 (116-288); SEQ ID NO:822
(108-262); SEQ ID NO:836 (158-353); SEQ ID NO:1288 (73-234); SEQ ID NO:1365 (69-257);
SEQ ID NO:1540 (289-471); SEQ ID NO:1549 (200-391); SEQ ID NO:1556 (163-354);
SEQ ID NO:1557 (207-398); SEQ ID NO:1563 (107-298); SEQ ID NO:1622 (180-365);
SEQ ID NO:1630 (100-291); SEQ ID NO:1704 (184-372); SEQ ID NO:1808 (36-161);
SEQ ID NO:1454 (49-209); SEQ ID NO:2363 (48-211); SEQ ID NO:2424 (43-194);
SEQ ID NO:3147 (190-369); SEQ ID NO:3152 (129-320); SEQ ID NO:3158
(167-334); and SEQ ID NO:3208 (34-256).

SEQ ID NOs:186 (175-395); 2591 (60-165); 3307 (43-321); and 3339
(94-342) encode polypeptides having an SH2 domain, and SEQ ID NOs:234 (23-121),
1832 (18-173), and 1835 (57-206) encode polypeptides having an SH3 domain. Nine
polynucleotides encode new members of the family of proteins having Ank repeat regions:
SEQ ID NO:187 (358-432); SEQ ID NO:1268 (238-315); SEQ ID NO:1804 (301-378);
SEQ ID NO:1819 (278-355); SEQ ID NO:1839 (224-307); SEQ ID NO:1830 (184-267);
SEQ ID NO:2562 (18-101); SEQ ID NO:3015 (131-214); and SEQ ID NO:3267
(97-180).

The following eleven polynucleotides encode polypeptides having a C2H2
type zinc finger: SEQ ID NOs:308 (110-172); 807 (339-392); 1324 (294-356); 1503
(154-216); 1527 (156-212); 1674 (196-258); 1779 (64-126); 1801 (295-351); 3081 (190-252);
3193 (293-355); and 3306 (161-223). Eight polynucleotides encode polypeptides of the
family of ATPases: SEQ ID NOs:431 (71-428); 639 (157-561); 2135 (2-401); 2684
(9-461); 2859 (100-320); 3178 (45-386); 3197 (281-343); and 3266 (8-139). Polypeptides
having a fibronectin type III domain are encoded by SEQ ID NOs:746 (209-427) and 1192
(186-416). Polypeptides having an EF-hand domain are encoded by SEQ ID NOs:820
(341-406); 1755 (281-367); and 3285 (16-102). Six polypeptides of the protein kinase family are
encoded by SEQ ID NOs:1157 (41-444); 1478 (54-437); 1496 (241-520); 2286 (12-182);
2969 (5-387); and 3190 (118-390).

LIM domain-containing polypeptides are encoded by SEQ ID NOs:1269
(79-240); 1309 (248-404); 1360 (222-377); and 1386 (243-398). Two polypeptides of the
family having a C2 domain (protein kinase C-like) are encoded by SEQ ID NOs:1325
(1-234) and 2282 (183-353). Polypeptides having a WD domain, G-beta repeat motif are
encoded by SEQ ID NOs:1336 (66-164); 1380 (42-140); 1711 (263-361); 1762 (236-334);
1909 (160-258); 2218 (127-225); 3047 (191-292); 3108 (275-367); and 3292 (208-300).

SEQ ID NO:1410 (222-350) encodes a member of the trypsin family. SEQ
ID NOs:1417 (8-354); 2281 (20-387); and 2310 (20-371) encode members of the protein
tyrosine phosphatase family. SEQ ID NOs:1464 (4-180) and 1514 (2-252) encode
members of the family having an RNA recognition motif (also known as RRM, RBD, or
RNP domain). SEQ ID NOs:1496 (241-520) and 3297 (7-153) encode helicases having a
conserved C-terminal domain. SEQ ID NO:1538 (9-635) encodes a member of the wnt
family of developmental signaling proteins.

Three polynucleotides encode polypeptides having a homeobox domain:
SEQ ID NOs:1676 (9-86); 1820 (123-299); and 1821 (127-303). A novel thioredoxin is
encoded by SEQ ID NO:1677 (316-369). Two novel members of the ras family are
encoded by SEQ ID NOs:1688 (109-410) and 3258 (138-394). A novel polypeptide having a
phosphatidylinositol-specific phospholipase C Y-domain is encoded by SEQ ID NO:1707
(92-439). A novel serine carboxypeptidase is encoded by SEQ ID NO:1744 (238-433). A
novel polypeptide having N-terminal homology in the Ets domain is encoded by SEQ ID
NO:1811 (184-315). A novel polypeptide having a bromodomain is encoded by SEQ ID
NO:1814 (127-294). A novel polypeptide having a double-stranded RNA binding motif is
encoded by SEQ ID NO:1818 (9-146). A novel polypeptide having a G-protein alpha
subunit is encoded by SEQ ID NO:1846 (12-398).

SEQ ID NOs:1911 (35-151) and 1980 (60-197) encode polypeptides
having a C3HC4 type zinc finger domain (RING finger). SEQ ID NO:2065 (253-306)
encodes a polypeptide having a CCHC zinc finger domain. SEQ ID NO:2216 (90-179)
encodes a polypeptide having a WW/rsp5/WWP domain. SEQ ID NO:2428 (25-350)
encodes a polypeptide member of the dual specificity phosphatase family, having a
catalytic domain.

SEQ ID NOs:2577 (0-311); 3183 (14-215); and 3195 (0-215) encode
members of the 4 transmembrane segment integral membrane protein family. SEQ ID
NOs:2826 (116-400) and 2871 (198-392) encode polypeptides of the DEAD and DEAH
box helicase family. SEQ ID NO:2944 (18-281) encodes a polypeptide having a
calpain large subunit, domain III.

SEQ ID NO:3274 (11-187) encodes a eukaryotic transcription factor
with a fork head domain. SEQ ID NO:3345 (65-271) encodes a polypeptide having a
PDZ domain, and SEQ ID NO:3351 (124-270) encodes a polypeptide in the family of
phorbol esters/glycerol binding proteins.
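
By way of non-limiting illustration, the nucleotide ranges given in parentheses
above can be applied to a sequence as sketched below in Python. The sketch assumes
that the ranges are 1-based and inclusive, which the text does not state (a few of the
listed ranges begin at 0 and would need a different convention). The function name
`coding_region` and the example sequence are hypothetical and are not taken from the
Sequence Listing.

```python
def coding_region(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end of seq (assumed 1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    # Convert the 1-based inclusive range to a Python half-open slice.
    return seq[start - 1:end]


# Hypothetical 30-nt sequence standing in for one SEQ ID NO entry.
example = "GGCATGACCGAATTTCTGAAATAGCCGTAA"
print(coding_region(example, 4, 24))
```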

Described below are polynucleotide compositions encompassed by the
invention, methods for obtaining cDNA or genomic DNA encoding a full-length gene
product, expression of these polynucleotides and genes, identification of structural motifs
of the polynucleotides and genes, identification of the function of a gene product encoded
by a gene corresponding to a polynucleotide of the invention, use of the provided
polynucleotides as probes and in mapping and in tissue profiling, use of the corresponding
polypeptides and other gene products to raise antibodies, and use of the polynucleotides
and their encoded gene products for therapeutic and diagnostic purposes.



Polynucleotide Compositions 

The scope of the invention with respect to polynucleotide compositions
includes, but is not necessarily limited to, polynucleotides having a sequence set forth in
any one of SEQ ID NOs:1-3351; polynucleotides obtained from the biological materials
described herein or other biological sources (particularly human sources) by
hybridization under stringent conditions (particularly conditions of high stringency);
genes corresponding to the provided polynucleotides; variants of the provided
polynucleotides and their corresponding genes, particularly those variants that retain a
biological activity of the encoded gene product (e.g., a biological activity ascribed to a
gene product corresponding to the provided polynucleotides as a result of the
assignment of the gene product to a protein family(ies) and/or identification of a
functional domain present in the gene product). Other nucleic acid compositions
contemplated by and within the scope of the present invention will be readily apparent
to one of ordinary skill in the art when provided with the disclosure here.

"Polynucleotide" and "nucleic acid" as used herein with reference to nucleic acids of
the composition are not intended to be limiting as to the length or structure of the nucleic
acid unless specifically indicated.

The invention features polynucleotides that are expressed in human
tissue, specifically human colon, breast, and/or lung tissue. Novel nucleic acid
compositions of the invention comprise a sequence set forth in any one of SEQ ID
NOs:1-3351 or an identifying sequence thereof. An "identifying sequence" is a
contiguous sequence of residues at least about 10 nt to about 20 nt in length, usually at
least about 50 nt to about 100 nt in length, that uniquely identifies a polynucleotide
sequence, e.g., exhibits less than 90%, usually less than about 80% to about 85%
sequence identity to any contiguous nucleotide sequence of more than about 20 nt.
Thus, the subject novel nucleic acid compositions include full length cDNAs or mRNAs
that encompass an identifying sequence of contiguous nucleotides from any one of SEQ
ID NOs:1-3351.

The polynucleotides of the invention also include polynucleotides having
sequence similarity or sequence identity. Nucleic acids having sequence similarity are
detected by hybridization under low stringency conditions, for example, at 50°C and
10XSSC (0.9 M saline/0.09 M sodium citrate) and remain bound when subjected to
washing at 55°C in 1XSSC. Sequence identity can be determined by hybridization
under stringent conditions, for example, at 50°C or higher and 0.1XSSC (9 mM
saline/0.9 mM sodium citrate). Hybridization methods and conditions are well known
in the art; see, e.g., U.S. Patent No. 5,707,829. Nucleic acids that are substantially
identical to the provided polynucleotide sequences, e.g., allelic variants, genetically
altered versions of the gene, etc., bind to the provided polynucleotide sequences (SEQ
ID NOs:1-3351) under stringent hybridization conditions. By using probes, particularly
labeled probes of DNA sequences, one can isolate homologous or related genes. The
source of homologous genes can be any species, e.g., primate species, particularly
human; rodents, such as rats and mice; canines, felines, bovines, ovines, equines, yeast,
nematodes, etc.

Preferably, hybridization is performed using at least 15 contiguous
nucleotides (nt) of at least one of SEQ ID NOs:1-3351. That is, when at least 15
contiguous nt of one of the disclosed SEQ ID NOs. is used as a probe, the probe will
preferentially hybridize with a nucleic acid comprising the complementary sequence,
allowing the identification and retrieval of the nucleic acids that uniquely hybridize to
the selected probe. Probes from more than one SEQ ID NO. can hybridize with the
same nucleic acid if the cDNA from which they were derived corresponds to one
mRNA. Probes of more than 15 nt can be used, e.g., probes of from about 18 nt to
about 100 nt, but 15 nt represents sufficient sequence for unique identification.
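
By way of non-limiting illustration, the selection of 15-nt probes and the
complementarity on which hybridization relies can be sketched in Python as follows.
The sketch is idealized: it treats hybridization as exact base pairing with no
mismatches and ignores temperature and salt conditions. All names
(`reverse_complement`, `probe_windows`, `pairs_exactly`) and the example sequence are
hypothetical.

```python
# Translation table for the Watson-Crick complement of a DNA strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_windows(seq: str, length: int = 15):
    """Yield (1-based start, probe) for every contiguous window of the given length."""
    for i in range(len(seq) - length + 1):
        yield i + 1, seq[i:i + length]


def pairs_exactly(probe: str, target: str) -> bool:
    """True if target contains the perfect complement of probe (idealized model)."""
    return reverse_complement(probe) in target.upper()


# Hypothetical 20-nt sequence; it yields six overlapping 15-nt probes.
source = "ATGGCCATTGTAATGGGCCG"
probes = list(probe_windows(source))
print(len(probes), probes[0])
```

Each probe drawn from one strand pairs, in this model, with the complementary strand
of the same sequence.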

The polynucleotides of the invention also include naturally occurring
variants of the nucleotide sequences (e.g., degenerate variants, allelic variants).
Variants of the polynucleotides of the invention are identified by hybridization of
putative variants with nucleotide sequences disclosed herein, preferably by
hybridization under stringent conditions. For example, by using appropriate wash
conditions, variants of the polynucleotides of the invention can be identified where the
allelic variant exhibits at most about 25-30% base pair (bp) mismatches relative to the
selected polynucleotide probe. In general, allelic variants contain 15-25% bp
mismatches, and can contain as few as 5-15%, or 2-5%, or 1-2% bp mismatches,
as well as a single bp mismatch.
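
By way of non-limiting illustration, the percentage of bp mismatches between a probe
and an aligned candidate variant can be computed as sketched below. The sketch
assumes two sequences of equal length that are already aligned without gaps; the
function name `percent_mismatch` and the example sequences are hypothetical.

```python
def percent_mismatch(probe: str, variant: str) -> float:
    """Percentage of positions at which two equal-length sequences differ."""
    if not probe or len(probe) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(a != b for a, b in zip(probe.upper(), variant.upper()))
    return 100.0 * mismatches / len(probe)


# Hypothetical 20-nt probe and a variant differing at a single position.
probe = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACGAACGTACGT"
print(percent_mismatch(probe, variant))  # one mismatch in 20 nt, i.e. 5.0
```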

The invention also encompasses homologs corresponding to the
polynucleotides of SEQ ID NOs:1-3351, where the source of homologous genes can be
any mammalian species, e.g., primate species, particularly human; rodents, such as rats;
canines, felines, bovines, ovines, equines, yeast, nematodes, etc. Between mammalian
species, e.g., human and mouse, homologs generally have substantial sequence
similarity, e.g., at least 75% sequence identity, usually at least 90%, more usually at
least 95% between nucleotide sequences. Sequence similarity is calculated based on a
reference sequence, which may be a subset of a larger sequence, such as a conserved
motif, coding region, flanking region, etc. A reference sequence will usually be at least
about 18 contiguous nt long, more usually at least about 30 nt long, and may extend to
the complete sequence that is being compared. Algorithms for sequence analysis are
known in the art, such as BLAST, described in Altschul et al., J. Mol. Biol. (1990)
215:403-10.

In general, variants of the invention have a sequence identity greater than
at least about 65%, preferably at least about 75%, more preferably at least about 85%,
and can be greater than at least about 90%, 91%, 92%, 93%, 94%, 95%, or 96%, most
preferably 97%, 98% or 99%. For the purposes of this invention, a preferred method of
calculating percent identity is the Smith-Waterman algorithm, applied as follows.
Global DNA sequence identity must be greater than 65% as determined by the
Smith-Waterman homology search algorithm as implemented in MPSRCH program (Oxford
Molecular) using an affine gap search with the following search parameters: gap open
penalty, 12; and gap extension penalty, 1.
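
By way of non-limiting illustration, a Smith-Waterman local alignment score with
affine gap penalties can be computed as sketched below. This is not the MPSRCH
implementation. The match and mismatch scores (+5 and -4) are assumptions, as is the
convention that the first position of a gap costs the gap open penalty and each
further position costs the gap extension penalty. The sketch returns only the best
local score and does not compute percent identity.

```python
def smith_waterman_affine(a: str, b: str, match: int = 5, mismatch: int = -4,
                          gap_open: int = 12, gap_extend: int = 1) -> int:
    """Best local alignment score of a and b with affine gaps (Gotoh recurrences)."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of an alignment ending at (i, j).
    # E: best score ending at (i, j) with b[j-1] aligned to a gap.
    # F: best score ending at (i, j) with a[i-1] aligned to a gap.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            H[i][j] = max(0, H[i - 1][j - 1] + score, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Hypothetical sequences: eight matches (40) minus one 4-nt gap (12 + 3 * 1 = 15).
print(smith_waterman_affine("AAAACCCCGGGG", "AAAAGGGG"))
```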

The subject nucleic acids can be cDNAs or genomic DNAs, as well as
fragments thereof, particularly fragments that encode a biologically active gene product
and/or are useful in the methods disclosed herein (e.g., in diagnosis, as a unique
identifier of a differentially expressed gene of interest, etc.). The term "cDNA" as used
herein is intended to include all nucleic acids that share the arrangement of sequence
elements found in native mature mRNA species, where sequence elements are exons
and 3' and 5' non-coding regions. Normally mRNA species have contiguous exons,
with the intervening introns, when present, being removed by nuclear RNA splicing, to
create a continuous open reading frame encoding a polypeptide of the invention.
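
By way of non-limiting illustration, the relationship between a genomic sequence and
the contiguous exons of a cDNA can be sketched as follows. The exon coordinates are
assumed to be 1-based and inclusive and listed in transcript order; the function name
`splice` and the example sequence are hypothetical.

```python
def splice(genomic: str, exons: list[tuple[int, int]]) -> str:
    """Join exons (1-based inclusive coordinates, in transcript order)."""
    return "".join(genomic[start - 1:end] for start, end in exons)


# Hypothetical gene: exon 1 (nt 1-6), a 12-nt intron (nt 7-18), exon 2 (nt 19-24).
genomic = "ATGGCC" + "GTAAGTTTTCAG" + "GAATAA"
print(splice(genomic, [(1, 6), (19, 24)]))  # ATGGCCGAATAA
```

Removing the intron leaves a continuous open reading frame, here ATG GCC GAA TAA.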

A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences,
including all of the introns that are normally present in a native chromosome. It can
further include the 3' and 5' untranslated regions found in the mature mRNA. It can
further include specific transcriptional and translational regulatory sequences, such as
promoters, enhancers, etc., including about 1 kb, but possibly more, of flanking
genomic DNA at either the 5' or 3' end of the transcribed region. The genomic DNA
can be isolated as a fragment of 100 kbp or smaller, and substantially free of flanking
chromosomal sequence. The genomic DNA flanking the coding region, either 3' or
5', or internal regulatory sequences as sometimes found in introns, contains sequences
required for proper tissue, stage-specific, or disease-state specific expression.

The nucleic acid compositions of the subject invention can encode all or
a part of the subject polypeptides. Double or single stranded fragments can be obtained
from the DNA sequence by chemically synthesizing oligonucleotides in accordance
with conventional methods, by restriction enzyme digestion, by PCR amplification, etc.
Isolated polynucleotides and polynucleotide fragments of the invention comprise at
least about 10, about 15, about 20, about 35, about 50, about 100, about 150 to about
200, about 250 to about 300, or about 350 contiguous nt selected from the
polynucleotide sequences as shown in SEQ ID NOs:1-3351. The fragments also
include those of lengths intermediate to the specifically mentioned lengths, such as 35,
36, 37, 38, 39, etc.; 150, 151, 152, 153, 154, etc. For the most part, fragments will be of
at least 15 nt, usually at least 18 nt or 25 nt, and up to at least about 50 contiguous nt in
length or more. In a preferred embodiment, the polynucleotide molecules comprise a
contiguous sequence of at least 12 nt selected from the group consisting of the
polynucleotides shown in SEQ ID NOs:1-3351.

Probes specific to the polynucleotides of the invention can be generated
using the polynucleotide sequences disclosed in SEQ ID NOs:1-3351. The probes are
preferably at least about a 12, 15, 16, 18, 20, 22, 24, or 25 nt fragment of a
corresponding contiguous sequence of SEQ ID NOs:1-3351, and can be less than 2, 1,
0.5, 0.1, or 0.05 kb in length. The probes can be synthesized chemically or can be
generated from longer polynucleotides using restriction enzymes. The probes can be
labeled, for example, with a radioactive, biotinylated, or fluorescent tag. Preferably,
probes are designed based upon an identifying sequence of a polynucleotide of one of
SEQ ID NOs:1-3351. More preferably, probes are designed based on a contiguous
sequence of one of the subject polynucleotides that remains unmasked following
application of a masking program for masking low complexity (e.g., XBLAST) to the
sequence; that is, one would select an unmasked region, as indicated by the
polynucleotides outside the poly-n stretches of the masked sequence produced by the
masking program.
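
By way of non-limiting illustration, unmasked regions long enough to serve as probes
can be located in a masked sequence as sketched below. The sketch assumes that the
masking program has replaced low-complexity stretches with the letter N and that a
minimum probe length of 12 nt is desired; the function name `unmasked_regions` and the
example sequence are hypothetical, and the sketch does not run XBLAST itself.

```python
import re


def unmasked_regions(masked: str, min_len: int = 12) -> list[tuple[int, str]]:
    """Return (1-based start, subsequence) for each run of unmasked bases of at least min_len."""
    return [
        (m.start() + 1, m.group())
        for m in re.finditer(r"[ACGT]+", masked.upper())
        if len(m.group()) >= min_len
    ]


# Hypothetical masked sequence: two usable regions separated by poly-N stretches
# and one 4-nt region that is too short to serve as a probe.
masked = "ACGTACGTACGTACGT" + "NNNNNNNN" + "ACGT" + "NNNN" + "GGCATTACGGATCCA"
for start, region in unmasked_regions(masked):
    print(start, region)
```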

The polynucleotides of the subject invention are isolated and obtained in
substantial purity, generally as other than an intact chromosome. Usually, the
polynucleotides, either as DNA or RNA, will be obtained substantially free of other
naturally-occurring nucleic acid sequences, generally being at least about 50%, usually
at least about 90% pure, and are typically "recombinant", e.g., flanked by one or more
nucleotides with which they are not normally associated on a naturally occurring
chromosome.

The polynucleotides of the invention can be provided as a linear
molecule or within a circular molecule, and can be provided within autonomously
replicating molecules (vectors) or within molecules without replication sequences.
Expression of the polynucleotides can be regulated by their own or by other regulatory
sequences known in the art. The polynucleotides of the invention can be introduced
into suitable host cells using a variety of techniques available in the art, such as
transferrin polycation-mediated DNA transfer, transfection with naked or encapsulated
nucleic acids, liposome-mediated DNA transfer, intracellular transportation of
DNA-coated latex beads, protoplast fusion, viral infection, electroporation, gene gun, calcium
phosphate-mediated transfection, and the like.

The subject nucleic acid compositions can be used to, for example,
produce polypeptides, as probes for the detection of mRNA of the invention in
biological samples (e.g., extracts of human cells), to generate additional copies of the
polynucleotides, to generate ribozymes or antisense oligonucleotides, and as single
stranded DNA probes or as triple-strand forming oligonucleotides. The probes
described herein can be used to, for example, determine the presence or absence of the
polynucleotide sequences as shown in SEQ ID NOs:1-3351 or variants thereof in a
sample. These and other uses are described in more detail below.

Use of Polynucleotides to Obtain Full-Length cDNA, Gene, and Promoter Region

Full-length cDNA molecules comprising the disclosed polynucleotides
are obtained as follows. A polynucleotide having a sequence of one of SEQ ID
NOs:1-3351, or a portion thereof comprising at least 12, 15, 18, or 20 nt, is used as a
hybridization probe to detect hybridizing members of a cDNA library using probe
design methods, cloning methods, and clone selection techniques such as those
described in U.S. Patent No. 5,654,173. Libraries of cDNA are made from selected
tissues, such as normal or tumor tissue, or from tissues of a mammal treated with, for
example, a pharmaceutical agent. Preferably, the tissue is the same as the tissue from
which the polynucleotides of the invention were isolated, as both the polynucleotides
described herein and the cDNA represent expressed genes. Most preferably, the cDNA
library is made from the biological material described herein in the Examples. The
choice of cell type for library construction can be made after the identity of the protein
encoded by the gene corresponding to the polynucleotide of the invention is known.
This will indicate which tissue and cell types are likely to express the related gene, and
thus represent a suitable source for the mRNA for generating the cDNA. As described
in the Examples, cDNA of the invention was isolated from specific cell or tissue types,
and such cells and tissues are preferable for obtaining related nucleic acids.

Techniques for producing and probing nucleic acid sequence libraries are
described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd Ed., (1989) Cold Spring Harbor Press, Cold Spring Harbor, NY. The cDNA can be
prepared by using primers based on sequence from SEQ ID NOs:1-3351. In one
embodiment, the cDNA library can be made from only poly-adenylated mRNA. Thus,
poly-T primers can be used to prepare cDNA from the mRNA.

Members of the library that are larger than the provided polynucleotides,
and preferably that encompass the complete coding sequence of the native message, are
obtained. In order to confirm that the entire cDNA has been obtained, RNA protection
experiments are performed as follows. Hybridization of a full-length cDNA to an
mRNA will protect the RNA from RNase degradation. If the cDNA is not full length,
then the portions of the mRNA that are not hybridized will be subject to RNase
degradation. This is assayed, as is known in the art, by changes in electrophoretic
mobility on polyacrylamide gels, or by detection of released monoribonucleotides.
Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., (1989) Cold
Spring Harbor Press, Cold Spring Harbor, NY. In order to obtain additional sequences
5' to the end of a partial cDNA, 5' RACE (PCR Protocols: A Guide to Methods and
Applications, (1990) Academic Press, Inc.) can be performed.

Genomic DNA is isolated using the provided polynucleotides in a
manner similar to the isolation of full-length cDNAs. Briefly, the provided
polynucleotides, or portions thereof, are used as probes to libraries of genomic DNA.
Preferably, the library is obtained from the cell type that was used to generate the
polynucleotides of the invention, but this is not essential. Most preferably, the genomic
DNA is obtained from the biological material described herein in the Examples. Such
libraries can be in vectors suitable for carrying large segments of a genome, such as P1
or YAC, as described in detail in Sambrook et al., 9.4-9.30. In addition, genomic
sequences can be isolated from human BAC libraries, which are commercially available
from Research Genetics, Inc., Huntsville, Alabama, USA, for example. In order to
obtain additional 5' or 3' sequences, chromosome walking is performed, as described in
Sambrook et al., such that adjacent and overlapping fragments of genomic DNA are
isolated. These are mapped and pieced together, as is known in the art, using restriction
digestion enzymes and DNA ligase.

Using the polynucleotide sequences of the invention, corresponding
full-length genes can be isolated using both classical and PCR methods to construct and
probe cDNA libraries. Using either method, Northern blots, preferably, are performed
on a number of cell types to determine which cell lines express the gene of interest at
the highest level. Classical methods of constructing cDNA libraries are taught in
Sambrook et al., supra. With these methods, cDNA can be produced from mRNA and
inserted into viral or expression vectors. Typically, libraries of mRNA comprising
poly(A) tails can be produced with poly(T) primers. Similarly, cDNA libraries can be
produced using the instant sequences as primers.

PCR methods are used to amplify the members of a cDNA library that
comprise the desired insert. In this case, the desired insert will contain sequence from
the full length cDNA that corresponds to the instant polynucleotides. Such PCR
methods include gene trapping and RACE methods as described in Gruber et al., WO
95/04745 and Gruber et al., U.S. Patent No. 5,500,356. Kits are commercially available
to perform gene trapping experiments from, for example, Life Technologies,
Gaithersburg, Maryland, USA. In preferred embodiments of RACE, a common primer
is designed to anneal to an arbitrary adaptor sequence ligated to cDNA ends (Apte and
Siebert, Biotechniques (1993) 15:890-893; Edwards et al., Nuc. Acids Res. (1991)
19:5227-5232). When a single gene-specific RACE primer is paired with the common
primer, preferential amplification of sequences between the single gene specific primer
and the common primer occurs. Commercial cDNA pools modified for use in RACE
are available.

The promoter region of a gene generally is located 5' to the initiation site
for RNA polymerase II. Hundreds of promoter regions contain the "TATA" box, a
sequence such as TATTA or TATAA, which is sensitive to mutations. The promoter 
region can be obtained by performing 5' RACE using a primer from the coding region 
of the gene. Alternatively, the cDNA can be used as a probe for the genomic sequence, 
and the region 5' to the coding region is identified by "walking up." If the gene is 
highly expressed or differentially expressed, the promoter from the gene can be of use 
in a regulatory construct for a heterologous gene. 
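
By way of non-limiting illustration, candidate TATA box sequences of the two forms
named above (TATAA and TATTA) can be located in a 5' flanking sequence as sketched
below. The sketch performs a simple exact-match scan and makes no claim that a hit is
a functional promoter element; the function name `find_tata_like` and the example
sequence are hypothetical.

```python
import re


def find_tata_like(seq: str) -> list[tuple[int, str]]:
    """Return (1-based position, motif) for each TATAA or TATTA occurrence."""
    # The lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile(r"(?=(TATAA|TATTA))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical 5' flanking sequence containing one motif of each form.
flank = "GGCTATAAAGGCTATTAGC"
print(find_tata_like(flank))
```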

Once the full-length cDNA or gene is obtained, DNA encoding variants
can be prepared by site-directed mutagenesis, described in detail in Sambrook et al.,
15.3-15.63. The choice of codon or nucleotide to be replaced can be based on disclosure
herein on optional changes in amino acids to achieve altered protein structure and/or
function.

As an alternative method to obtaining DNA or RNA from a biological
material, nucleic acid comprising nucleotides having the sequence of one or more
polynucleotides of the invention can be synthesized. Thus, the invention encompasses
nucleic acid molecules ranging in length from 15 nt (corresponding to at least 15
contiguous nt of one of SEQ ID NOs:1-3351) up to a maximum length suitable for one
or more biological manipulations, including replication and expression, of the nucleic
acid molecule. The invention includes but is not limited to (a) nucleic acid having the
size of a full gene, and comprising at least one of SEQ ID NOs:1-3351; (b) the nucleic
acid of (a) also comprising at least one additional polynucleotide or gene, operably
linked to permit expression of a fusion protein; (c) an expression vector comprising (a)
or (b); (d) a plasmid comprising (a) or (b); and (e) a recombinant viral particle
comprising (a) or (b). Once provided with the polynucleotides disclosed herein,
construction or preparation of (a)-(e) is well within the skill in the art.

The sequence of a nucleic acid comprising at least 15 contiguous nt of at
least any one of SEQ ID NOs:1-3351, preferably the entire sequence of at least any one
of SEQ ID NOs:1-3351, is not limited and can be any sequence of A, T, G, and/or C
(for DNA) and A, U, G, and/or C (for RNA) or modified bases thereof, including
inosine and pseudouridine. The choice of sequence will depend on the desired function
and can be dictated by coding regions desired, the intron-like regions desired, and the
regulatory regions desired. Where the entire sequence of any one of SEQ ID
NOs:1-3351 is within the nucleic acid, the nucleic acid obtained is referred to herein as a
polynucleotide comprising the sequence of any one of SEQ ID NOs:1-3351.

Expression of Polypeptide Encoded by Full-Length cDNA or Full-Length Gene

The provided polynucleotides (e.g., a polynucleotide having a sequence
of one of SEQ ID NOs:1-3351), the corresponding cDNA, or the full-length gene is
used to express a partial or complete gene product. Constructs of polynucleotides
having sequences of SEQ ID NOs:1-3351 can be generated synthetically. Alternatively,
single-step assembly of a gene and entire plasmid from large numbers of
oligodeoxyribonucleotides is described by Stemmer et al., Gene (Amsterdam)
(1995) 164(1):49-53. In this method, assembly PCR (the synthesis of long DNA
sequences from large numbers of oligodeoxyribonucleotides (oligos)) is described. The
method is derived from DNA shuffling (Stemmer, Nature (1994) 370:389-391) and
does not rely on DNA ligase, but instead relies on DNA polymerase to build
increasingly longer DNA fragments during the assembly process.
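
By way of non-limiting illustration, a set of overlapping oligos of the general kind
used as input to polymerase-based assembly can be designed as sketched below. The
oligo length (40 nt), the overlap (20 nt), and the alternation of strands are
assumptions chosen for the sketch and are not taken from the cited reference; the
function names and the example sequence are hypothetical. The sketch designs the
oligos only and does not model the polymerase extension cycles.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_assembly_oligos(seq: str, oligo_len: int = 40, overlap: int = 20) -> list[str]:
    """Tile seq with overlapping oligos, alternating sense and antisense strands."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo length")
    step = oligo_len - overlap
    oligos = []
    # Stop once the remaining tail is already covered by the previous oligo.
    for k, start in enumerate(range(0, max(len(seq) - overlap, 1), step)):
        piece = seq[start:start + oligo_len].upper()
        # Even-numbered oligos are sense strand; odd-numbered are antisense,
        # so neighbouring oligos can anneal across their shared overlap.
        oligos.append(piece if k % 2 == 0 else reverse_complement(piece))
    return oligos


# Hypothetical 100-nt target sequence.
target = "ACGT" * 25
for oligo in design_assembly_oligos(target):
    print(len(oligo), oligo)
```

Because adjacent oligos come from opposite strands and share an overlap, each can
prime extension on its neighbour, which is the property on which ligase-free assembly
relies.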

Appropriate polynucleotide constructs are purified using standard
recombinant DNA techniques as described in, for example, Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed., (1989) Cold Spring Harbor Press, Cold Spring
Harbor, NY, and under current regulations described in United States Dept. of HHS,
National Institutes of Health (NIH) Guidelines for Recombinant DNA Research. The
gene product encoded by a polynucleotide of the invention is expressed in any
expression system, including, for example, bacterial, yeast, insect, amphibian and
mammalian systems. Vectors, host cells and methods for obtaining expression in same
are well known in the art. Suitable vectors and host cells are described in U.S. Patent
No. 5,654,173.

Polynucleotide molecules comprising a polynucleotide sequence 
provided herein are generally propagated by placing the molecule in a vector. Viral and 
non-viral vectors are used, including plasmids. The choice of plasmid will depend on 
the type of cell in which propagation is desired and the purpose of propagation. Certain 
vectors are useful for amplifying and making large amounts of the desired DNA
sequence. Other vectors are suitable for expression in cells in culture. Still other
vectors are suitable for transfer and expression in cells in a whole animal or person. The
choice of appropriate vector is well within the skill of the art. Many such vectors are
available commercially. Methods for preparation of vectors comprising a desired
sequence are well known in the art. 

The polynucleotides set forth in SEQ ID NOs:1-3351 or their
corresponding full-length polynucleotides are linked to regulatory sequences as
appropriate to obtain the desired expression properties. These can include promoters
(attached either at the 5' end of the sense strand or at the 3' end of the antisense strand),
enhancers, terminators, operators, repressors, and inducers. The promoters can be
regulated or constitutive. In some situations it may be desirable to use conditionally
active promoters, such as tissue-specific or developmental stage-specific promoters.
These are linked to the desired nucleotide sequence using the techniques described
above for linkage to vectors. Any techniques known in the art can be used.

When any appropriate host cells or organisms are used to replicate
and/or express the polynucleotides or nucleic acids of the invention, the resulting
replicated nucleic acid, RNA, expressed protein or polypeptide, is within the scope of
the invention as a product of the host cell or organism. The product is recovered by any
appropriate means known in the art.

Once the gene corresponding to a selected polynucleotide is identified, 
its expression can be regulated in the cell to which the gene is native. For example, an 
endogenous gene of a cell can be regulated by an exogenous regulatory sequence as 
disclosed in U.S. Patent No. 5,641,670.

Identification of Functional and Structural Motifs of Novel Genes

Translations of the nucleotide sequence of the provided polynucleotides, 
cDNAs or full genes can be aligned with individual known sequences. Similarity with 
individual sequences can be used to determine the activity of the polypeptides encoded 
by the polynucleotides of the invention. Also, sequences exhibiting similarity with
more than one individual sequence can exhibit activities that are characteristic of either
or both individual sequences.

The full length sequences and fragments of the polynucleotide sequences 
of the nearest neighbors can be used as probes and primers to identify and isolate the 
full length sequence corresponding to provided polynucleotides. The nearest neighbors
can indicate a tissue or cell type to be used to construct a library for the full-length 
sequences corresponding to the provided polynucleotides. 

Typically, a selected polynucleotide is translated in all six frames to
determine the best alignment with the individual sequences. The sequences disclosed
herein in the Sequence Listing are in a 5' to 3' orientation and translation in three
frames can be sufficient. These amino acid sequences are referred to, generally, as
query sequences, which will be aligned with the individual sequences. Databases with
individual sequences are described in "Computer Methods for Macromolecular 
Sequence Analysis" Methods in Enzymology (1996) 266, Doolittle, Academic Press, 
Inc., a division of Harcourt Brace & Co., San Diego, California, USA. Databases 
include Genbank, EMBL, and DNA Database of Japan (DDBJ). 
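
By way of illustration only, the six-frame translation step described above can be
sketched in Python as follows. The function names (reverse_complement, translate,
six_frame_translations) and the frame labels ("+1" to "-3") are assumptions introduced
for this sketch and are not part of any cited method. The standard genetic code is
assumed, stop codons are written as "*", and codons containing unrecognized characters
are written as "X".

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels '+1'..'+3' and '-1'..'-3' to translated sequences."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Where the orientation of the sequence is known to be 5' to 3', only the three "+"
frames of the returned mapping need be considered.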

Query and individual sequences can be aligned using the methods and
computer programs described above, which include BLAST, available over the world
wide web at http://www.ncbi.nlm.nih.gov/BLAST. Another alignment algorithm is
Fasta, available in the Genetics Computing Group (GCG) package, Madison,
Wisconsin, USA, a wholly owned subsidiary of Oxford Molecular Group, Inc. Other
techniques for alignment are described in Doolittle, supra. Preferably, an alignment
program that permits gaps in the sequence is utilized to align the sequences. The
Smith-Waterman algorithm is one type of algorithm that permits gaps in sequence
alignments. See Meth. Mol. Biol. (1997) 70:173-187. Also, the GAP program using the
Needleman and Wunsch alignment method can be utilized to align sequences. An
alternative search strategy uses MPSRCH software, which runs on a MASPAR computer.
MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel
computer. This approach improves the ability to identify distantly related matches,
and is especially tolerant of small gaps and nucleotide sequence errors. Amino acid
sequences encoded by the provided polynucleotides can be used to search both protein
and DNA databases.
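
As a non-limiting sketch of a gap-permitting local alignment of the Smith-Waterman
type, the following Python function fills the scoring matrix and traces back the
best-scoring local alignment. The scoring values (match=2, mismatch=-1, gap=-2) and
the linear gap penalty are assumptions chosen for illustration; they are not the
parameters of BLAST, Fasta, GAP, or MPSRCH, which typically use substitution matrices
and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; every cell is floored at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-scoring cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap in sequence b
            i -= 1
        else:
            out_a.append("-")  # gap in sequence a
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

For example, aligning "GATTACA" with "GATACA" under these assumed scores yields a
score of 10 with a single gap introduced into the shorter sequence.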

High Similarity. In general, in alignment results considered to be of high
similarity, the percent of the alignment region length is typically at least about 55% of
the total length of the query sequence; more typically, at least about 58%; even more
typically, at least about 60% of the total residue length of the query sequence. Usually,
percent length of the alignment region can be as much as about 62%; more usually, as
much as about 64%; even more usually, as much as about 66%. Further, for high
similarity, the region of alignment typically exhibits at least about 75% sequence
identity; more typically, at least about 78%; even more typically, at least about 80%
sequence identity. Usually, percent sequence identity can be as much as about 82%;
more usually, as much as about 84%; even more usually, as much as about 86%.

The p value is used in conjunction with these methods. If high similarity
is found, the query sequence is considered to have high similarity with a profile
sequence when the p value is less than or equal to about 10^-2; more usually, less than or
equal to about 10^-3; even more usually, less than or equal to about 10^-4. More typically,
the p value is no more than about 10^-5; more typically, no more than or equal to about
10^-10; even more typically, no more than or equal to about 10^-15 for the query sequence
to be considered high similarity.
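
The high-similarity criteria above (alignment region length, percent identity, and p
value) can be combined in a simple test, sketched here in Python. The function name
and its parameters are hypothetical. The default thresholds are the least stringent
values recited above (about 55% of query length, about 75% identity, p value of about
10^-2); the stricter recited values can be passed in their place.

```python
def is_high_similarity(query_len, aligned_len, identical, p_value,
                       min_coverage=0.55, min_identity=0.75, max_p=1e-2):
    """Apply the three high-similarity criteria to one alignment result.

    query_len   -- total residues in the query sequence
    aligned_len -- residues in the aligned region
    identical   -- identical residues within the aligned region
    p_value     -- p value reported by the alignment program
    """
    if query_len == 0 or aligned_len == 0:
        return False
    coverage = aligned_len / query_len   # fraction of the query that is aligned
    identity = identical / aligned_len   # identity within the aligned region
    return (coverage >= min_coverage
            and identity >= min_identity
            and max_p >= p_value)
```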

Similarity Determined by Sequence Identity Alone. Sequence identity
alone can be used to determine similarity of a query sequence to an individual sequence
and can indicate the activity of the sequence. Such an alignment, preferably, permits
gaps to align sequences. Typically, the query sequence is related to the profile sequence
if the sequence identity over the entire query sequence is at least about 15%; more
typically, at least about 20%; even more typically, at least about 25%; even more
typically, at least about 50%. Sequence identity alone as a measure of similarity is most
useful when the query sequence is usually at least 80 residues in length; more usually,
90 residues; even more usually, at least 95 amino acid residues in length. More
typically, similarity can be concluded based on sequence identity alone when the query
sequence is preferably 100 residues in length; more preferably, 120 residues in length;
even more preferably, 150 amino acid residues in length.
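
A minimal Python sketch of the identity-alone criterion follows. It assumes the two
sequences have already been aligned with gaps written as "-"; the function names and
the default cutoffs (15% identity, 80 residues) are illustrative choices taken from
the least stringent values recited above.

```python
def identity_over_query(aligned_query, aligned_subject):
    """Fraction of query residues that are identical to the aligned subject."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    query_len = sum(1 for residue in aligned_query if residue != "-")
    if query_len == 0:
        return 0.0
    matches = sum(
        1 for q, s in zip(aligned_query, aligned_subject) if q == s and q != "-"
    )
    return matches / query_len


def related_by_identity(aligned_query, aligned_subject,
                        min_identity=0.15, min_query_len=80):
    """True when the query is long enough and identity meets the cutoff."""
    query_len = sum(1 for residue in aligned_query if residue != "-")
    if min_query_len > query_len:
        return False
    return identity_over_query(aligned_query, aligned_subject) >= min_identity
```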

Alignments with Profile and Multiple Aligned Sequences. Translations
of the provided polynucleotides can be aligned with amino acid profiles that define
either protein families or common motifs. Also, translations of the provided
polynucleotides can be aligned to multiple sequence alignments (MSA) comprising the
polypeptide sequences of members of protein families or motifs. Similarity or identity
with profile sequences or MSAs can be used to determine the activity of the gene
products (e.g., polypeptides) encoded by the provided polynucleotides or corresponding
cDNA or genes. For example, sequences that show an identity or similarity with a
chemokine profile or MSA can exhibit chemokine activities.

Profiles can be designed manually by (1) creating an MSA, which is an
alignment of the amino acid sequences of members that belong to the family and (2)
constructing a statistical representation of the alignment. Such methods are described,
for example, in Birney et al., Nucl. Acid Res. (1996) 24(14):2730-2739. MSAs of some
protein families and motifs are publicly available. MSAs are described also in
Sonnhammer et al., Proteins (1997) 28:405-420. A brief description of MSAs is
reported in Pascarella et al., Prot. Eng. (1996) 9(3):249-251. Techniques for building
profiles from MSAs are described in Sonnhammer et al., supra; Birney et al., supra;
and "Computer Methods for Macromolecular Sequence Analysis," Methods in
Enzymology (1996) 266, Doolittle, Academic Press, Inc., San Diego, California, USA.
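
As a simplified illustration of step (2), a statistical representation of an MSA can
be as basic as the per-column residue frequencies. The Python sketch below is such a
frequency profile only; it is an assumption made for illustration and is not the
profile or hidden Markov model formulation used in the cited references. The function
name build_profile is hypothetical, and gaps ("-") are excluded from the counts.

```python
from collections import Counter


def build_profile(msa):
    """Return one {residue: frequency} mapping per alignment column.

    msa -- list of equal-length aligned sequences, gaps written as '-'
    """
    if not msa:
        raise ValueError("the alignment is empty")
    width = len(msa[0])
    if any(len(row) != width for row in msa):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(width):
        residues = [row[col] for row in msa if row[col] != "-"]
        counts = Counter(residues)
        total = len(residues)
        # An all-gap column yields an empty mapping.
        profile.append({aa: n / total for aa, n in counts.items()})
    return profile
```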

Similarity between a query sequence and a protein family or motif can be
determined by (a) comparing the query sequence against the profile and/or (b) aligning
the query sequence with the members of the family or motif. Typically, a program such
as Searchwise is used to compare the query sequence to the statistical representation of
the multiple alignment, also known as a profile (see Birney et al., supra). Other
techniques to compare the sequence and profile are described in Sonnhammer et al.,
supra and Doolittle, supra.

Next, methods described by Feng et al., J. Mol. Evol. (1987) 25:351 and
Higgins et al., CABIOS (1989) 5:151 can be used to align the query sequence with the
members of a family or motif, also known as an MSA. Sequence alignments can be
generated using any of a variety of software tools. Examples include PileUp, which
creates a multiple sequence alignment, and is described in Feng et al., J. Mol. Evol.
(1987) 25:351. Another method, GAP, uses the alignment method of Needleman et al.,
J. Mol. Biol. (1970) 48:443. GAP is best suited for global alignment of sequences. A
third method, BestFit, functions by inserting gaps to maximize the number of matches
using the local homology algorithm of Smith et al., Adv. Appl. Math. (1981) 2:482. In
general, the following factors are used to determine if a similarity between a query
sequence and a profile or MSA exists: (1) number of conserved residues found in the
query sequence, (2) percentage of conserved residues found in the query sequence, (3)
number of frameshifts, and (4) spacing between conserved residues.

Some alignment programs that both translate and align sequences can
make any number of frameshifts when translating the nucleotide sequence to produce
the best alignment. The fewer frameshifts needed to produce an alignment, the stronger
the similarity or identity between the query and profile or MSAs. For example, a weak
similarity resulting from no frameshifts can be a better indication of activity or structure
of a query sequence than a strong similarity resulting from two frameshifts. Preferably,
three or fewer frameshifts are found in an alignment; more preferably, two or fewer
frameshifts; even more preferably, one or fewer frameshifts; even more preferably, no
frameshifts are found in an alignment of query and profile or MSAs.

Conserved residues are those amino acids found at a particular position
in all or some of the family or motif members. Alternatively, a position is considered
conserved if only a certain class of amino acids is found in a particular position in all or
some of the family members. For example, the N-terminal position can contain a
positively charged amino acid, such as lysine, arginine, or histidine.

Typically, a residue of a polypeptide is conserved when a class of amino
acids or a single amino acid is found at a particular position in at least about 40% of all
class members; more typically, at least about 50%; even more typically, at least about
60% of the members. Usually, a residue is conserved when a class or single amino acid
is found in at least about 70% of the members of a family or motif; more usually, at
least about 80%; even more usually, at least about 90%; even more usually, at least
about 95%.

A residue is considered conserved when three unrelated amino acids are
found at a particular position in some or all of the members; more usually, two
unrelated amino acids. These residues are conserved when the unrelated amino acids
are found at particular positions in at least about 40% of all class members; more
typically, at least about 50%; even more typically, at least about 60% of the members.
Usually, a residue is conserved when a class or single amino acid is found in at least
about 70% of the members of a family or motif; more usually, at least about 80%; even
more usually, at least about 90%; even more usually, at least about 95%.

A query sequence has similarity to a profile or MSA when the query
sequence comprises at least about 25% of the conserved residues of the profile or MSA;
more usually, at least about 30%; even more usually, at least about 40%. Typically, the
query sequence has a stronger similarity to a profile sequence or MSA when the query
sequence comprises at least about 45% of the conserved residues of the profile or MSA;
more typically, at least about 50%; even more typically, at least about 55%.
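
The conserved-residue criteria above can be sketched in Python as follows. The sketch
covers only the single-amino-acid case (not classes of amino acids or sets of
unrelated amino acids), and it assumes that the query has already been aligned to the
MSA so that column indices correspond. The function names are hypothetical, and the
40% default is the least stringent conservation value recited above.

```python
from collections import Counter


def conserved_positions(msa, min_fraction=0.40):
    """Map column index to the residue present in at least min_fraction of members."""
    members = len(msa)
    conserved = {}
    for col in range(len(msa[0])):
        counts = Counter(row[col] for row in msa if row[col] != "-")
        if counts:
            residue, n = counts.most_common(1)[0]
            if n / members >= min_fraction:
                conserved[col] = residue
    return conserved


def fraction_conserved_matched(aligned_query, conserved):
    """Fraction of the conserved positions at which the query carries the residue."""
    if not conserved:
        return 0.0
    hits = sum(1 for col, aa in conserved.items() if aligned_query[col] == aa)
    return hits / len(conserved)
```

A query would then be regarded as similar to the MSA when the returned fraction is at
least about 0.25, and as more strongly similar at about 0.45 or above.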

Identification of Secreted and Membrane-Bound Polypeptides

Both secreted and membrane-bound polypeptides of the present
invention are of particular interest. For example, levels of secreted polypeptides can be
assayed in body fluids that are convenient, such as blood, plasma, serum, and other
body fluids such as urine, prostatic fluid and semen. Membrane-bound polypeptides are
useful for constructing vaccine antigens or inducing an immune response. Such
antigens would comprise all or part of the extracellular region of the membrane-bound
polypeptides. Because both secreted and membrane-bound polypeptides comprise a
fragment of contiguous hydrophobic amino acids, hydrophobicity predicting algorithms
can be used to identify such polypeptides.

A signal sequence is usually encoded by both secreted and membrane-bound
polypeptide genes to direct a polypeptide to the surface of the cell. The signal
sequence usually comprises a stretch of hydrophobic residues. Such signal sequences
can fold into helical structures. Membrane-bound polypeptides typically comprise at
least one transmembrane region that possesses a stretch of hydrophobic amino acids that
can traverse the membrane. Some transmembrane regions also exhibit a helical
structure. Hydrophobic fragments within a polypeptide can be identified by using
computer algorithms. Such algorithms include Hopp & Woods, Proc. Natl. Acad. Sci.
USA (1981) 78:3824-3828; Kyte & Doolittle, J. Mol. Biol. (1982) 157:105-132; and
RAOAR algorithm, Degli Esposti et al., Eur. J. Biochem. (1990) 190:207-219.
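
For illustration, a sliding-window hydropathy calculation in the manner of Kyte &
Doolittle can be sketched in Python as follows. The per-residue values are the
published Kyte-Doolittle hydropathy indices. The function names, the treatment of
unrecognized residues as 0.0, and the default window of 19 residues with a mean
hydropathy cutoff of 1.6 (a commonly used heuristic for candidate membrane-spanning
segments) are assumptions of this sketch.

```python
# Kyte-Doolittle hydropathy index for the twenty standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean hydropathy of each window; an empty list if the window does not fit."""
    protein = protein.upper()
    if window == 0 or window > len(protein):
        return []
    scores = [KD.get(aa, 0.0) for aa in protein]  # unknown residues count as 0.0
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def hydrophobic_window_starts(protein, window=19, threshold=1.6):
    """Start indices of windows whose mean hydropathy meets the threshold."""
    return [
        i for i, mean in enumerate(hydropathy_profile(protein, window))
        if mean >= threshold
    ]
```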

Another method of identifying secreted and membrane-bound
polypeptides is to translate the polynucleotides of the invention in all six frames and
determine if at least 8 contiguous hydrophobic amino acids are present. Those
translated polypeptides with at least 8; more typically, 10; even more typically, 12
contiguous hydrophobic amino acids are considered to be either a putative secreted or
membrane-bound polypeptide. Hydrophobic amino acids include alanine, glycine,
histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, threonine,
tryptophan, tyrosine, and valine.
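
The contiguous-residue test described above can be sketched in Python as follows. The
set of hydrophobic residues is exactly the list recited above, written in one-letter
code; the function names are hypothetical, and the input is assumed to be a single
translated frame.

```python
# One-letter codes for the residues listed above: Ala, Gly, His, Ile, Leu, Lys,
# Met, Phe, Pro, Thr, Trp, Tyr, Val.
HYDROPHOBIC = set("AGHILKMFPTWYV")


def longest_hydrophobic_run(protein):
    """Length of the longest stretch of contiguous hydrophobic residues."""
    best = current = 0
    for aa in protein.upper():
        current = current + 1 if aa in HYDROPHOBIC else 0
        best = max(best, current)
    return best


def is_putative_secreted_or_membrane(protein, min_run=8):
    """True when a run of at least min_run hydrophobic residues is present."""
    return longest_hydrophobic_run(protein) >= min_run
```

Passing min_run=10 or min_run=12 applies the more stringent values recited above.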

Identification of the Function of an Expression Product of a Full-Length Gene

Ribozymes, antisense constructs, and dominant negative mutants can be
used to determine function of the expression product of a gene corresponding to a
polynucleotide provided herein. The phosphoramidite method of oligonucleotide
synthesis can be used to construct antisense molecules and ribozymes. See Beaucage et
al., Tet. Lett. (1981) 22:1859 and U.S. Patent No. 4,668,777. Automated devices for
synthesis are available to create oligonucleotides using this chemistry. Examples of
such devices include Biosearch 8600, Models 392 and 394 by Applied Biosystems, a
division of Perkin-Elmer Corp., Foster City, California, USA; and Expedite by
Perceptive Biosystems, Framingham, Massachusetts, USA. Synthetic RNA, phosphate
analog oligonucleotides, and chemically derivatized oligonucleotides can also be
produced, and can be covalently attached to other molecules. RNA oligonucleotides
can be synthesized, for example, using RNA phosphoramidites. This method can be
performed on an automated synthesizer, such as Applied Biosystems, Models 392 and
394, Foster City, California, USA.

Oligonucleotides of up to 200 nt can be synthesized, more typically, 100
nt, more typically 50 nt; even more typically 30 to 40 nt. These synthetic fragments can
be annealed and ligated together to construct larger fragments. See, for example,
Sambrook et al., supra. Trans-cleaving catalytic RNAs (ribozymes) are RNA
molecules possessing endoribonuclease activity. Ribozymes are specifically designed
for a particular target, and the target message must contain a specific nucleotide
sequence. They are engineered to cleave any RNA species site-specifically in the
background of cellular RNA. The cleavage event renders the mRNA unstable and
prevents protein expression. Importantly, ribozymes can be used to inhibit expression
of a gene of unknown function for the purpose of determining its function in an in vitro
or in vivo context, by detecting the phenotypic effect.

Antisense nucleic acids are designed to specifically bind to RNA,
resulting in the formation of RNA-DNA or RNA-RNA hybrids, with an arrest of DNA
replication, reverse transcription or messenger RNA translation. Antisense
polynucleotides based on a selected polynucleotide sequence can interfere with
expression of the corresponding gene. Antisense polynucleotides are typically
generated within the cell by expression from antisense constructs that contain the
antisense strand as the transcribed strand. Antisense polynucleotides based on the
disclosed polynucleotides will bind and/or interfere with the translation of mRNA
comprising a sequence complementary to the antisense polynucleotide. The expression
products of control cells and cells treated with the antisense construct are compared to
detect the protein product of the gene corresponding to the polynucleotide upon which
the antisense construct is based. The protein is isolated and identified using routine
biochemical methods.
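
Because an antisense polynucleotide is complementary to its target mRNA, a candidate
antisense DNA oligonucleotide for a chosen target region is simply the reverse
complement of that region. A minimal Python sketch follows. The function name, the
zero-based start coordinate, and the choice to return a DNA sequence written 5' to 3'
are assumptions of this sketch; it says nothing about which regions of a message make
effective targets.

```python
def antisense_oligo(mrna, start, length):
    """Return the antisense DNA oligo (5' to 3') for mrna[start:start+length].

    mrna   -- target message, written 5' to 3' with either U or T
    start  -- zero-based index of the first targeted nucleotide
    length -- number of nucleotides in the oligo
    """
    target = mrna.upper().replace("U", "T")[start:start + length]
    if length > len(target):
        raise ValueError("target region extends past the end of the message")
    complement = str.maketrans("ACGT", "TGCA")
    return target.translate(complement)[::-1]
```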

Given the extensive background literature and clinical experience in
antisense therapy, one skilled in the art can use selected polynucleotides of the
invention as additional potential therapeutics. The choice of polynucleotide can be
narrowed by first testing them for binding to "hot spot" regions of the genome of
cancerous cells. If a polynucleotide is identified as binding to a "hot spot," testing the
polynucleotide as an antisense compound in the corresponding cancer cells is
warranted.

Dominant negative mutations also are readily generated for
corresponding proteins that are active as homomultimers. A mutant polypeptide will
interact with wild-type polypeptides (made from the other allele) and form a
non-functional multimer. Thus, a mutation is in a substrate-binding domain, a catalytic
domain, or a cellular localization domain. Preferably, the mutant polypeptide will be
overproduced. Point mutations are made that have such an effect. In addition, fusion of
different polypeptides of various lengths to the terminus of a protein can yield dominant
negative mutants. General strategies are available for making dominant negative
mutants (see, e.g., Herskowitz, Nature (1987) 329:219). Such techniques can be used to
create loss of function mutations, which are useful for determining protein function.



Polypeptides and Variants Thereof

The polypeptides of the invention include those encoded by the disclosed
polynucleotides, as well as by nucleic acids that, by virtue of the degeneracy of the
genetic code, are not identical in sequence to the disclosed polynucleotides. Thus, the
invention includes within its scope a polypeptide encoded by a polynucleotide having the
sequence of any one of SEQ ID NOs:1-3351 or a variant thereof.

In general, the term "polypeptide" as used herein refers to the full
length polypeptide encoded by the recited polynucleotide, the polypeptide encoded by
the gene represented by the recited polynucleotide, as well as portions or fragments
thereof. "Polypeptides" also includes variants of the naturally occurring proteins, where
such variants are homologous or substantially similar to the naturally occurring protein,
and can be of an origin of the same or different species as the naturally occurring
protein (e.g., human, murine, or some other species that naturally expresses the recited
polypeptide, usually a mammalian species). In general, variant polypeptides have a
sequence that has at least about 80%, usually at least about 90%, and more usually at
least about 98% sequence identity with a differentially expressed polypeptide of the
invention, as measured by BLAST using the parameters described above. The variant
polypeptides can be naturally or non-naturally glycosylated, i.e., the polypeptide has a
glycosylation pattern that differs from the glycosylation pattern found in the
corresponding naturally occurring protein.

The invention also encompasses homologs of the disclosed polypeptides
(or fragments thereof) where the homologs are isolated from other species, i.e., other
animal or plant species, usually mammalian species, e.g.,
rodents, such as mice, rats; domestic animals, e.g., horse, cow, dog, cat; and humans.
By "homolog" is meant a polypeptide having at least about 35%, usually at least about
40% and more usually at least about 60% amino acid sequence identity to a particular
differentially expressed protein as identified above, where sequence identity is
determined using the BLAST algorithm, with the parameters described above.

In general, the polypeptides of the subject invention are provided in a
non-naturally occurring environment, e.g., are separated from their naturally occurring
environment. [...] less than 90%, usually less than 60% and more usually less than 50%
of the composition is made up of non-differentially expressed polypeptides.

Also within the scope of the invention are variants; variants of [...]
member of a protein family, a region associated with a consensus sequence). Selection
of amino acid alterations for production of variants can be based upon the accessibility
(interior vs. exterior) of the amino acid (see, e.g., Go et al., Int. J. Peptide Protein Res.
(1980) 15:211), the thermostability of the variant polypeptide (see, e.g., Querol et al.,
Prot. Eng. (1996) 9:265), desired glycosylation sites (see, e.g., Olsen and Thomsen,
J. Gen. Microbiol. (1991) 137:579), desired disulfide bridges (see, e.g., Clarke et al.,
Biochemistry (1993) 32:4322; and Wakarchuk et al., Protein Eng. (1994) 7:1379), [...]
particularly biologically active fragments and/or fragments corresponding to functional
domains. Fragments of interest will typically be at least about 10 aa to at least about
15 aa in length, usually at least about 50 aa in length, and can be as long as 300 aa in
length or longer, but will usually not exceed about 1000 aa in length, where the fragment
will have a stretch of amino acids that is identical to a polypeptide encoded by a
polynucleotide having a sequence of any of SEQ ID NOs:1-3351, or a homolog thereof.

The genetic code can be used to select the appropriate codons to
construct the corresponding variants.

Computer-Related Embodiments

In general, a library of polynucleotides is a collection of sequence
information, which information is provided in either biochemical form (e.g., as a
collection of polynucleotide molecules), or in electronic form (e.g., as a collection of
sequences stored in a computer-readable form, as in a computer system [...]).
[...] marker is a representation of a gene product that is present [...]
disease either at an increased or decreased level relative to a normal cell (e.g., a cell of
the same or similar type that is not substantially affected by disease). For example, a
polynucleotide sequence in a library can be a polynucleotide that represents an mRNA,
polypeptide, or other gene product encoded by the polynucleotide, that is either
overexpressed or underexpressed in a breast ductal cell affected by cancer relative to a
[...] form, e.g., electronic or biochemical forms. For example, a library of
sequence information embodied in electronic form comprises an accessible computer
data file (or, in biochemical form, a collection of nucleic acid molecules) that contains
the representative nucleotide sequences of genes that are differentially expressed (e.g.,
overexpressed or underexpressed) as between, for example, i) a cancerous cell and a
normal cell; ii) a cancerous cell and a dysplastic cell; iii) a cancerous cell and a cell
affected by a disease or condition other than cancer; iv) a metastatic cancerous cell and
a normal cell and/or non-metastatic cancerous cell; v) a malignant cancerous cell and a
[...] Biochemical embodiments of the library include a collection of nucleic acids that
have the sequences of the genes in the library, where the nucleic acids can correspond to
the entire gene in the library or to a fragment thereof, as described in greater detail below.

The polynucleotide libraries of the subject invention generally comprise
sequence information of a plurality of polynucleotide sequences, where at least one of
the polynucleotides has a sequence of any of SEQ ID NOs:1-3351. By plurality is
meant at least 2, usually at least 3, and can include up to all of SEQ ID NOs:1-3351.
The length and number of polynucleotides in the library will vary with the nature of the
library, e.g., if the library is an oligonucleotide array, a cDNA array, a computer
database of the sequence information, etc.
Where the library is an electronic library, the nucleic acid sequence
information can be present in a variety of media. "Media" refers to a manufacture,
other than an isolated nucleic acid molecule, that contains the sequence information of
the present invention. Such a manufacture provides the genome sequence or a subset
thereof in a form that can be examined by means not directly applicable to the sequence
as it exists in a nucleic acid. For example, the nucleotide sequence of the present
invention, e.g., the nucleic acid sequences of any of the polynucleotides of SEQ ID
NOs:1-3351, can be recorded on computer readable media, e.g., any medium that can be
read and accessed directly by a computer. Such media include, but are not limited to:
magnetic storage media, such as a floppy disc, a hard disc storage medium, and a
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories such as magnetic/optical storage
media. One of skill in the art can readily appreciate how any of the presently known
computer readable mediums can be used to create a manufacture comprising a recording
of the present sequence information. "Recorded" refers to a process for storing
information on computer readable medium, using any such methods as known in the art.
Any convenient data storage structure can be chosen, based on the means used to access
the stored information. A variety of data processor programs and formats can be used
for storage, e.g., word processing text file, database format, etc. In addition to the
sequence information, electronic versions of the libraries of the invention can be
provided in conjunction or connection with other computer-readable information and/or
other types of computer-readable files (e.g., searchable files, executable files, etc.,
including, but not limited to, for example, search program software, etc.).

By providing the nucleotide sequence in computer readable form, the
information can be accessed for a variety of purposes. Computer software to access
sequence information is publicly available. For example, the BLAST (Altschul et al.,
supra) and BLAZE (Brutlag et al., Comp. Chem. (1993) 17:203) search algorithms on a
Sybase system can be used to identify open reading frames (ORFs) within the genome
that contain homology to ORFs from other organisms.
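
As one illustration of locating open reading frames in stored sequence information,
the Python sketch below scans all six frames for a start codon (ATG) followed in frame
by a stop codon. It does not reproduce BLAST or BLAZE, which compare sequences rather
than scan for ORFs. The function name, the min_codons parameter, and the coordinate
convention (zero-based, half-open, reported on the scanned strand and including the
stop codon) are assumptions of this sketch.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=30):
    """Return (strand, start, end) for each ATG-to-stop ORF in six frames.

    Coordinates index the scanned strand: for strand '-', they refer to the
    reverse complement of the input. min_codons counts codons before the stop.
    """
    forward = dna.upper()
    reverse = forward.translate(str.maketrans("ACGT", "TGCA"))[::-1]
    orfs = []
    for strand, seq in (("+", forward), ("-", reverse)):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```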

As used herein, "a computer-based system" refers to the hardware
means, software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware of the computer-based
systems of the present invention comprises a central processing unit (CPU), input
means, output means, and data storage means. A skilled artisan can readily appreciate
that any one of the currently available computer-based systems is suitable for use in the
present invention. The data storage means can comprise any manufacture comprising a
recording of the present sequence information as described above, or a memory access
means that can access such a manufacture.

"Search means" refers to one or more programs implemented on the
computer-based system, to compare a target sequence or target structural motif, or
expression levels of a polynucleotide in a sample, with the stored sequence information.
Search means can be used to identify fragments or regions of the genome that match a
particular target sequence or target motif. A variety of known algorithms are publicly
known and commercially available, e.g., MacPattern (EMBL), BLASTN and BLASTX
(NCBI). A "target sequence" can be any polynucleotide or amino acid sequence of six
or more contiguous nucleotides or two or more amino acids, preferably from about 10
to 100 amino acids or from about 30 to 300 nt. A variety of comparing means can be
used to accomplish comparison of sequence information from a sample (e.g., to analyze
target sequences, target motifs, or relative expression levels) with the data storage
means. A skilled artisan can readily recognize that any one of the publicly available
homology search programs can be used as the search means for the computer based
systems of the present invention to accomplish comparison of target sequences and
motifs. Computer programs to analyze expression levels in a sample and in controls are
also known in the art.

A "target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen
based on a three-dimensional configuration that is formed upon the folding of the target
motif, or on consensus sequences of regulatory or active sites. There are a variety of
target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are
not limited to, hairpin structures, promoter sequences and other expression elements
such as binding sites for transcription factors.

A variety of structural formats for the input and output means can be 
used to input and output the information in the computer-based systems of the present 
invention. One format for an output means ranks the relative expression levels of
different polynucleotides. Such presentation provides a skilled artisan with a ranking of 
relative expression levels to determine a gene expression profile. 
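
An output format of this kind can be sketched in Python as a ranking of
sample-to-control expression ratios. The function name, the dictionary inputs, and the
sequence identifiers used in the example are hypothetical; sequences with a missing or
zero control level are omitted rather than assigned a ratio.

```python
def rank_relative_expression(sample, control):
    """Rank sequences by sample/control expression ratio, highest first.

    sample, control -- mappings from a sequence identifier to a measured level
    """
    ratios = {}
    for seq_id, level in sample.items():
        baseline = control.get(seq_id)
        if baseline:  # skips identifiers with no control value or a zero value
            ratios[seq_id] = level / baseline
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Example with made-up identifiers and levels.
ranking = rank_relative_expression(
    {"SEQ_A": 10, "SEQ_B": 40, "SEQ_C": 5},
    {"SEQ_A": 10, "SEQ_B": 10, "SEQ_C": 10},
)
```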

As discussed above, the "library" of the invention also encompasses
biochemical libraries of the polynucleotides of SEQ ID NOs:1-3351, e.g., collections of
nucleic acids representing the provided polynucleotides. The biochemical libraries can
take a variety of forms, e.g., a solution of cDNAs, a pattern of probe nucleic acids stably
associated with a surface of a solid support (i.e., an array) and the like. Of particular
interest are nucleic acid arrays in which one or more of SEQ ID NOs:1-3351 is
represented on the array. By array is meant an article of manufacture that has at least a
substrate with at least two distinct nucleic acid targets on one of its surfaces, where the
number of distinct nucleic acids can be considerably higher, typically being at least 10
nt, usually at least 20 nt and often at least 25 nt. A variety of different array formats
have been developed and are known to those of skill in the art. The arrays of the subject
invention find use in a variety of applications, including gene expression analysis, drug
screening, mutation analysis and the like, as disclosed in the above-listed exemplary
patent documents.

In addition to the above nucleic acid libraries, analogous libraries of
polypeptides are also provided, where the polypeptides of the library will
represent at least a portion of the polypeptides encoded by SEQ ID NOs:1-3351.

Use of Polynucleotide Probes in Mapping and in Tissue Profiling

Polynucleotide probes, generally comprising at least 12 contiguous nt of
a polynucleotide as shown in the Sequence Listing, are used for a variety of purposes,
such as chromosome mapping of the polynucleotide and detection of transcription
levels. Additional disclosure about preferred regions of the disclosed polynucleotide
sequences is found in the Examples. A probe that hybridizes specifically to a
polynucleotide disclosed herein should provide a detection signal at least 5-, 10-, or 20-fold
higher than the background hybridization provided with other unrelated sequences.
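The specificity criterion above reduces to a signal-to-background ratio test. The following Python sketch, in which the function name `is_specific` and the numeric readings are assumptions made for illustration, checks a probe against a chosen fold threshold.

```python
def is_specific(signal, background, fold=5.0):
    """Return True if the probe signal is at least `fold` times the background."""
    if background <= 0:
        raise ValueError("background hybridization must be a positive value")
    return signal / background >= fold


# Hypothetical detection readings, in arbitrary units.
print(is_specific(600.0, 100.0))           # meets the 5-fold criterion
print(is_specific(600.0, 100.0, fold=10))  # does not meet the 10-fold criterion
```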

Detection of Expression Levels. Nucleotide probes are used to detect
expression of a gene corresponding to the provided polynucleotide. In Northern blots,
mRNA is separated electrophoretically and contacted with a probe. A probe is detected
as hybridizing to an mRNA species of a particular size. The amount of hybridization is
quantitated to determine relative amounts of expression, for example under a particular
condition. Probes are used for in situ hybridization to cells to detect expression. Probes
can also be used in vivo for diagnostic detection of hybridizing sequences. Probes are
typically labeled with a radioactive isotope. Other types of detectable labels can be
used such as chromophores, fluors, and enzymes. Other examples of
hybridization assays are described in WO92/02526 and U.S. Patent No. 5,124,246.

Alternatively, the Polymerase Chain Reaction (PCR) is another means
for detecting small amounts of target nucleic acids (see, e.g., Mullis et al., Meth.
Enzymol. (1987) 155:335; U.S. Patent No. 4,683,195; and U.S. Patent No. 4,683,202).
Two primer polynucleotides that hybridize with the target nucleic acids are
used to prime the reaction. The primers can be composed of sequence within or 3' and
5' to the polynucleotides of the Sequence Listing. Alternatively, if the primers are 3' and
5' to these polynucleotides, they need not hybridize to them or the complements. After
amplification of the target with a thermostable polymerase, the amplified target nucleic
acids can be detected by methods known in the art, e.g., Southern blot. mRNA or
cDNA can also be detected by traditional blotting techniques (e.g., Southern blot,
Northern blot, etc.) described in Sambrook et al., "Molecular Cloning: A Laboratory
Manual" (New York, Cold Spring Harbor Laboratory, 1989) (e.g., without PCR
amplification). In general, mRNA or cDNA generated from mRNA using a polymerase
enzyme can be purified and separated using gel electrophoresis, and transferred to a
solid support, such as nitrocellulose. The solid support is exposed to a labeled probe,
washed to remove any unhybridized probe, and duplexes containing the labeled probe
are detected.
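The relationship between the two primers and the amplified target can be illustrated with a short Python sketch. It assumes exact primer matches on a single template strand, which is a simplification of real hybridization; the function names and the example sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predict_amplicon(template, forward, reverse):
    """Return the region bounded by the two primers, or None if a primer site is missing.

    The forward primer must match the template strand as written; the reverse
    primer must match the opposite strand, so its reverse complement is searched
    for downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template and primers, used only for illustration.
template = "GGATCCATGAAACCCGGGTTTTAGCTCGAG"
print(predict_amplicon(template, "ATGAAA", "CTAAAA"))
```

In practice primers tolerate mismatches and are designed with melting temperature in mind; those considerations are outside the scope of this sketch.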

Mapping. Polynucleotides of the present invention can be used to
identify a chromosome on which the corresponding gene resides. Such mapping can be
useful in identifying the function of the polynucleotide-related gene by its proximity to
other genes with known function. Function can also be assigned to the polynucleotide-related
gene when particular syndromes or diseases map to the same chromosome. For
example, use of polynucleotide probes in identification and quantification of nucleic
acid sequence aberrations is described in U.S. Patent No. 5,783,387. An exemplary
mapping method is fluorescence in situ hybridization (FISH), which facilitates
comparative genomic hybridization to allow total genome assessment of changes in
relative copy number of DNA sequences (see, e.g., Valdes et al., Methods in Molecular
Biology (1997) 68:1). Polynucleotides can also be mapped to particular chromosomes
using, for example, radiation hybrids or chromosome-specific hybrid panels. See Leach
et al., Advances in Genetics, (1995) 33:63-99; Walter et al., Nature Genetics (1994)
7:22; Walter and Goodfellow, Trends in Genetics (1992) 9:352. Panels for radiation
hybrid mapping are available from Research Genetics, Inc., Huntsville, Alabama, USA.
The statistical program RHMAP can be used to construct a map based on the data from
radiation hybridization with a measure of the relative likelihood of one order versus
another. RHMAP is available via the world wide web at
http://www.sph.umich.edu/group/statgen/software. In addition, commercial programs are
available for identifying regions of chromosomes commonly associated with disease,
such as cancer.

Tissue Typing or Profiling. Expression of specific mRNA
corresponding to the provided polynucleotides can vary in different cell types and can
be tissue-specific. This variation of mRNA levels in different cell types can be
exploited with nucleic acid probe assays to determine tissue types. For example, PCR,
branched DNA probe assays, or blotting techniques utilizing nucleic acid probes
substantially identical or complementary to polynucleotides listed in the Sequence
Listing can determine the presence or absence of the corresponding cDNA or mRNA.

Tissue typing can be used to identify the developmental organ or tissue 
source of a metastatic lesion by identifying the expression of a particular marker of that 
organ or tissue. If a polynucleotide is expressed only in a specific tissue type, and a 
metastatic lesion is found to express that polynucleotide, then the developmental source 
of the lesion has been identified. Expression of a particular polynucleotide can be 
assayed by detection of either the corresponding mRNA or the protein product. 

Use of Polymorphisms. A polynucleotide of the invention can be used in
forensics, genetic analysis, mapping, and diagnostic applications where the
corresponding region of a gene is polymorphic in the human population. Any means for
detecting a polymorphism in a gene can be used, including, but not limited to,
electrophoresis of protein polymorphic variants, differential sensitivity to restriction
enzyme cleavage, and hybridization to allele-specific probes.

Antibody Production

Expression products of a polynucleotide of the invention, as well as the
corresponding mRNA, cDNA, or complete gene, can be prepared and used for raising
antibodies for experimental, diagnostic, and therapeutic purposes. For polynucleotides
to which a corresponding gene has not been assigned, this provides an additional
method of identifying the corresponding gene. The polynucleotide or related cDNA is
expressed as described above, and antibodies are prepared. These antibodies are
specific to an epitope on the polypeptide encoded by the polynucleotide, and can
precipitate or bind to the corresponding native protein in a cell or tissue preparation or
in a cell-free extract of an in vitro expression system.

Methods for production of monoclonal and polyclonal antibodies that
specifically bind a selected antigen are well known in the art. The antibodies
specifically bind to epitopes present in the polypeptides encoded by polynucleotides
disclosed in the Sequence Listing. Typically, at least 6, 8, 10, or 12 contiguous amino
acids are required to form an epitope. Epitopes that involve non-contiguous amino
acids may require a longer polypeptide, e.g., at least 15, 25, or 50 amino acids.
Antibodies that specifically bind to human polypeptides encoded by the provided
polynucleotides should provide a detection signal at least 5-, 10-, or 20-fold higher than
a detection signal provided with other proteins when used in Western blots or other
immunochemical assays. Preferably, antibodies that specifically bind polypeptides of the
invention do not bind to other proteins in immunochemical assays at detectable levels
and can immunoprecipitate the specific polypeptide from solution.

The invention also contemplates naturally occurring antibodies specific
for a polypeptide of the invention. For example, serum antibodies to a polypeptide of
the invention in a human population can be purified by methods well known in the art,
e.g., by passing antiserum over a column to which the corresponding selected
polypeptide or fusion protein is bound. The bound antibodies can then be eluted from
the column, for example using a buffer with a high salt concentration.

In addition to the antibodies discussed above, the invention also
contemplates genetically engineered antibodies and antibody derivatives (e.g., single chain
antibodies, antibody fragments (e.g., Fab, etc.)), according to methods well known in
the art.

Other embodiments of the present invention include humanized
monoclonal antibodies capable of binding to the polypeptides of the invention. The
phrase "humanized antibody" refers to an antibody derived from a non-human antibody,
typically a mouse monoclonal antibody. Alternatively, a humanized antibody may be
derived from a chimeric antibody that retains or substantially retains the antigen-binding
properties of the parental, non-human, antibody but which exhibits diminished
immunogenicity as compared to the parental antibody when administered to humans.
The phrase "chimeric antibody," as used herein, refers to an antibody containing
sequence derived from two different antibodies (see, e.g., U.S. Patent No. 4,816,567)
which typically originate from different species. Most typically, chimeric antibodies
comprise human and murine antibody fragments, generally human constant and mouse
variable regions.

Because humanized antibodies are far less immunogenic in humans than
the parental mouse monoclonal antibodies, they can be used for the treatment of humans
with far less risk of anaphylaxis. Thus, these antibodies may be preferred in therapeutic
applications that involve in vivo administration to a human such as, e.g., use as radiation
sensitizers for the treatment of neoplastic disease or use in methods to reduce the side
effects of, e.g., cancer therapy.

Humanized antibodies may be achieved by a variety of methods
including, for example: (1) grafting the non-human complementarity determining
regions (CDRs) onto a human framework and constant region (a process referred to in
the art as "humanizing"), or, alternatively, (2) transplanting the entire non-human
variable domains, but "cloaking" them with a human-like surface by replacement of
surface residues (a process referred to in the art as "veneering"). In the present
invention, humanized antibodies will include both "humanized" and "veneered"
antibodies. These methods are disclosed in, e.g., Jones et al., Nature 321:522-525
(1986); Morrison et al., Proc. Natl. Acad. Sci. U.S.A., 81:6851-6855 (1984); Morrison
and Oi, Adv. Immunol., 44:65-92 (1988); Verhoeyen et al., Science 239:1534-1536
(1988); Padlan, Molec. Immun. 28:489-498 (1991); Padlan, Molec. Immunol. 31(3):169-217
(1994); and Kettleborough, C.A. et al., Protein Eng. 4(7):773-83 (1991), each of
which is incorporated herein by reference.

The phrase "complementarity determining region" refers to amino acid
sequences which together define the binding affinity and specificity of the natural Fv
region of a native immunoglobulin binding site. See, e.g., Chothia et al., J. Mol. Biol.
196:901-917 (1987); Kabat et al., U.S. Dept. of Health and Human Services NIH
Publication No. 91-3242 (1991). The phrase "constant region" refers to the portion of
the antibody molecule that confers effector functions. In the present invention, mouse
constant regions are substituted by human constant regions. The constant regions of the
subject humanized antibodies are derived from human immunoglobulins. The heavy
chain constant region can be selected from any of the five isotypes: alpha, delta,
epsilon, gamma or mu.

One method of humanizing antibodies comprises aligning the non-human
heavy and light chain sequences to human heavy and light chain sequences,
selecting and replacing the non-human framework with a human framework based on
such alignment, molecular modeling to predict the conformation of the humanized
sequence and comparing to the conformation of the parent antibody. This process is
followed by repeated back mutation of residues in the CDR region which disturb the
structure of the CDRs until the predicted conformation of the humanized sequence
model closely approximates the conformation of the non-human CDRs of the parent
non-human antibody. Such humanized antibodies may be further derivatized to
facilitate uptake and clearance, e.g., via Ashwell receptors. See, e.g., U.S. Patent Nos.
5,530,101 and 5,585,089, which patents are incorporated herein by reference.

Humanized antibodies can also be produced using transgenic animals
that are engineered to contain human immunoglobulin loci. For example, WO
98/24893 discloses transgenic animals having a human Ig locus wherein the animals do
not produce functional endogenous immunoglobulins due to the inactivation of
endogenous heavy and light chain loci. WO 91/10741 also discloses transgenic non-primate
mammalian hosts capable of mounting an immune response to an immunogen,
wherein the antibodies have primate constant and/or variable regions, and wherein the
endogenous immunoglobulin-encoding loci are substituted or inactivated. WO
96/30498 discloses the use of the Cre/Lox system to modify the immunoglobulin locus
in a mammal, such as to replace all or a portion of the constant or variable region to
form a modified antibody molecule. WO 94/02602 discloses non-human mammalian
hosts having inactivated endogenous Ig loci and functional human Ig loci. U.S. Patent
No. 5,939,598 discloses methods of making transgenic mice in which the mice lack
endogenous heavy chains, and express an exogenous immunoglobulin locus comprising
one or more xenogeneic constant regions.

Using a transgenic animal described above, an immune response can be
produced to a selected antigenic molecule, and antibody-producing cells can be
removed from the animal and used to produce hybridomas that secrete human
monoclonal antibodies. Immunization protocols, adjuvants, and the like are known in
the art, and are used in immunization of, for example, a transgenic mouse as described
in WO 96/33735. This publication discloses monoclonal antibodies against a variety of
antigenic molecules including IL-6, IL-8, TNFα, human CD4, L-selectin, gp39, and
tetanus toxin. The monoclonal antibodies can be tested for the ability to inhibit or
neutralize the biological activity or physiological effect of the corresponding protein.
WO 96/33735 discloses that monoclonal antibodies against IL-8, derived from immune
cells of transgenic mice immunized with IL-8, blocked IL-8-induced functions of
neutrophils. Human monoclonal antibodies with specificity for the antigen used to
immunize transgenic animals are also disclosed in WO 96/34096.

Polynucleotides or Arrays for Diagnostics

Polynucleotide arrays are created by spotting polynucleotide probes onto
a substrate (e.g., glass, nitrocellulose, etc.) in a two-dimensional matrix or array having
bound probes. The probes can be bound to the substrate by either covalent bonds or by
non-specific interactions, such as hydrophobic interactions. Samples of polynucleotides
can be detectably labeled (e.g., using radioactive or fluorescent labels) and then
hybridized to the probes. Double stranded polynucleotides, comprising the labeled
sample polynucleotides bound to probe polynucleotides, can be detected once the
unbound portion of the sample is washed away. Techniques for constructing arrays and
methods of using these arrays are described in EP 799 897; WO 97/29212; WO
97/27317; EP 785 280; WO 97/02357; U.S. Patent No. 5,593,839; U.S. Patent No.
5,578,832; EP 728 520; U.S. Patent No. 5,599,695; EP 721 016; U.S. Patent No.
5,556,752; WO 95/22058; and U.S. Patent No. 5,631,734. Arrays can be used to, for
example, examine differential expression of genes and can be used to determine gene
function. For example, arrays can be used to detect differential expression of a
polynucleotide between a test cell and control cell (e.g., cancer cells and normal cells).
For example, high expression of a particular message in a cancer cell, which is not
observed in a corresponding normal cell, can indicate a cancer specific gene product.
Exemplary uses of arrays are further described in, for example, Pappalarado et al., Sem.
Radiation Oncol. (1998) 8:217; and Ramsay, Nature Biotechnol. (1998) 16:40.

Differential Expression in Diagnosis

The polynucleotides of the invention can also be used to detect 
differences in expression levels between two cells, e.g., as a method to identify 
abnormal or diseased tissue in a human. For polynucleotides corresponding to profiles 
of protein families, the choice of tissue can be selected according to the putative 
biological function. In general, the expression of a gene corresponding to a specific 
polynucleotide is compared between a first tissue that is suspected of being diseased 
and a second, normal tissue of the human. The tissue suspected of being abnormal or 
diseased can be derived from a different tissue type of the human, but preferably it is 
derived from the same tissue type; for example an intestinal polyp or other abnormal 
growth should be compared with normal intestinal tissue. The normal tissue can be the
same tissue as that of the test sample, or any normal tissue of the patient, especially
those that express the polynucleotide-related gene of interest (e.g., brain, thymus, testis,
heart, prostate, placenta, spleen, small intestine, skeletal muscle, pancreas, and the
mucosal lining of the colon). A difference between the polynucleotide-related
gene, mRNA, or protein in the two tissues which are compared, for example in molecular
weight, amino acid or nucleotide sequence, or relative abundance, indicates a change in
the gene, or a gene which regulates it, in the tissue of the human that was suspected of
being diseased. Examples of detection of differential expression and its use in diagnosis
of cancer are described in U.S. Patent Nos. 5,688,641 and 5,677,125.

A genetic predisposition to disease in a human can also be detected by
comparing expression levels of an mRNA or protein corresponding to a polynucleotide
of the invention in a fetal tissue with levels associated with normal fetal tissue. Fetal
tissues that are used for this purpose include, but are not limited to, amniotic fluid,
chorionic villi, blood, and the blastomere of an in vitro-fertilized embryo. The
comparable normal polynucleotide-related gene is obtained from any tissue. The mRNA
or protein is obtained from a normal tissue of a human in which the polynucleotide-related
gene is expressed. Differences such as alterations in the nucleotide sequence or
size of the same product of the fetal polynucleotide-related gene or mRNA, or
alterations in the molecular weight, amino acid sequence, or relative abundance of fetal
protein, can indicate a germline mutation in the polynucleotide-related gene of the fetus,
which indicates a genetic predisposition to disease. In general, diagnostic, prognostic,
and other methods of the invention based on differential expression involve detection of
a level or amount of a gene product, particularly a differentially expressed gene product,
in a test sample obtained from a patient suspected of having or being susceptible to a
disease (e.g., breast cancer, lung cancer, colon cancer and/or metastatic forms thereof),
and comparing the detected levels to those levels found in normal cells (e.g., cells
substantially unaffected by cancer) and/or other control cells (e.g., to differentiate a
cancerous cell from a cell affected by dysplasia). Furthermore, the severity of the
disease can be assessed by comparing the detected levels of a differentially expressed
gene product with those levels detected in samples representing the levels of
differentially expressed gene product associated with varying degrees of severity of disease. It
should be noted that use of the term "diagnostic" herein is not necessarily meant to
exclude "prognostic" or "prognosis," but rather is used as a matter of convenience.

The term "differentially expressed gene" is generally intended to
encompass a polynucleotide that can, for example, include an open reading frame
encoding a gene product (e.g., a polypeptide), and/or introns of such genes and adjacent
5' and 3' non-coding nucleotide sequences involved in the regulation of expression up
to about 20 kb beyond the coding region, but possibly further in either direction. The
gene can be introduced into an appropriate vector for extrachromosomal maintenance or
for integration into a host genome. In general, a difference in expression level
associated with a decrease in expression level of at least about 25%, usually at least
about 50% to 75%, more usually at least about 90% or more is indicative of a
differentially expressed gene of interest, i.e., a gene that is underexpressed or down-regulated
in the test sample relative to a control sample. Furthermore, a difference in
expression level associated with an increase in expression of at least about 25%, usually
at least about 50% to 75%, more usually at least about 90% and can be at least about
1½-fold, usually at least about 2-fold to about 10-fold, and can be about 100-fold to
about 1,000-fold increase relative to a control sample is indicative of a differentially
expressed gene of interest, i.e., an overexpressed or up-regulated gene.
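The thresholds above can be expressed as a simple comparison of test and control levels. The following Python sketch uses the lowest stated threshold, a change of at least about 25%, as the default; the function name `classify_expression` and the example levels are assumptions made for illustration.

```python
def classify_expression(test_level, control_level, min_change=0.25):
    """Classify a gene by its fractional change in expression relative to a control."""
    if control_level <= 0:
        raise ValueError("control expression level must be positive")
    change = (test_level - control_level) / control_level
    if change >= min_change:
        return "up-regulated"
    if change <= -min_change:
        return "down-regulated"
    return "unchanged"


# Hypothetical expression levels, in arbitrary units.
print(classify_expression(150.0, 100.0))
print(classify_expression(70.0, 100.0))
print(classify_expression(110.0, 100.0))
```

A stricter criterion, such as the 50% or 90% levels mentioned above, can be applied by passing a different `min_change` value.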

"Differentially expressed polynucleotide" as used herein means a nucleic
acid molecule (RNA or DNA) comprising a sequence that represents a differentially
expressed gene, e.g., the differentially expressed polynucleotide comprises a sequence
(e.g., an open reading frame encoding a gene product) that uniquely identifies a
differentially expressed gene so that detection of the differentially expressed
polynucleotide in a sample is correlated with the presence of a differentially expressed
gene in a sample. "Differentially expressed polynucleotides" is also meant to
encompass fragments of the disclosed polynucleotides, e.g., fragments retaining
biological activity, as well as nucleic acids homologous, substantially similar, or
substantially identical (e.g., having about 90% sequence identity) to the disclosed
polynucleotides.

"Diagnosis" as used herein generally includes determination of a
subject's susceptibility to a disease or disorder, determination as to whether a subject is
presently affected by a disease or disorder, as well as to the prognosis of a subject
affected by a disease or disorder (e.g., identification of pre-metastatic or metastatic
cancerous states, stages of cancer, or responsiveness of cancer to therapy). The present
invention particularly encompasses diagnosis of subjects in the context of breast cancer
(e.g., carcinoma in situ (e.g., ductal carcinoma in situ), estrogen receptor (ER)-positive
breast cancer, ER-negative breast cancer, or other forms and/or stages of breast cancer),
lung cancer (e.g., small cell carcinoma, non-small cell carcinoma, mesothelioma, and
other forms and/or stages of lung cancer), and colon cancer (e.g., adenomatous polyp,
colorectal carcinoma, and other forms and/or stages of colon cancer).

"Sample" or "biological sample" as used throughout herein are generally
meant to refer to samples of biological fluids or tissues, particularly samples obtained
from tissues, especially from cells of the type associated with the disease for which the
diagnostic application is designed (e.g., ductal adenocarcinoma) and the like.
"Samples" is also meant to encompass derivatives and fractions of such samples (e.g.,
cell lysates). Where the sample is solid tissue, the cells of the tissue can be dissociated
or tissue sections can be analyzed.

Methods of the subject invention useful in diagnosis or prognosis
typically involve comparison of the abundance of a selected differentially expressed
gene product in a sample of interest with that of a control to determine any relative
differences in the expression of the gene product, where the difference can be measured
qualitatively and/or quantitatively. Quantitation can be accomplished, for example, by
comparing the level of expression product detected in the sample with the amounts of
product present in a standard curve. A comparison can be made visually; by using a
technique such as densitometry, with or without computerized assistance; by preparing
a representative library of cDNA clones of mRNA isolated from a test sample,
sequencing the clones in the library to determine the number of cDNA clones
corresponding to the same gene product, and analyzing the number of clones
corresponding to that same gene product relative to the number of clones of the same
gene product in a control sample; or by using an array to detect relative levels of
hybridization to a selected sequence or set of sequences, and comparing the
hybridization pattern to that of a control. The differences in expression are then
correlated with the presence or absence of an abnormal expression pattern. A variety of
different methods for determining the nucleic acid abundance in a sample are known to
those of skill in the art (see, e.g., WO 97/27317).

In general, diagnostic assays of the
invention involve detection of a gene product of the polynucleotide sequence (e.g.,
mRNA or polypeptide) that corresponds to a sequence of SEQ ID NOs:1-3351. The
patient from whom the sample is obtained can be apparently healthy, susceptible to
disease (e.g., as determined by family history or exposure to certain environmental
factors), or can already be identified as having a condition in which altered expression
of a gene product of the invention is implicated.

Diagnosis can be determined based on detected gene product expression
levels of a gene product encoded by at least one, preferably at least two or more, at least
3 or more, or at least 4 or more of the polynucleotides having a sequence set forth in
SEQ ID NOs:1-3351, and can involve detection of expression of genes corresponding to
all of SEQ ID NOs:1-3351 and/or additional sequences that can serve as additional
diagnostic markers and/or reference sequences. Where the diagnostic method is
designed to detect the presence or susceptibility of a patient to cancer, the assay
preferably involves detection of a gene product encoded by a gene corresponding to a
polynucleotide that is differentially expressed in cancer. Examples of such differentially
expressed polynucleotides are described in the Examples below. Given the provided
polynucleotides and information regarding their relative expression levels provided
herein, assays using such polynucleotides and detection of their expression levels in
diagnosis and prognosis will be readily apparent to the ordinarily skilled artisan.

Any of a variety of detectable labels can be used in connection with the
various embodiments of the diagnostic methods of the invention. Suitable detectable
labels include fluorochromes (e.g., fluorescein isothiocyanate (FITC), rhodamine,
Texas Red, phycoerythrin, allophycocyanin, 6-carboxyfluorescein (6-FAM),
2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein, 6-carboxy-X-rhodamine (ROX),
6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX), 5-carboxyfluorescein (5-FAM) or
N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA)), radioactive labels (e.g., 32P,
35S, 3H, etc.), and the like. The detectable label can involve a two stage system (e.g.,
biotin-avidin, hapten-anti-hapten antibody, etc.).

Reagents specific for the polynucleotides and polypeptides of the 
invention, such as antibodies and nucleotide probes, can be supplied in a kit for 
detecting the presence of an expression product in a biological sample. The kit can also 
contain buffers or labeling components, as well as instructions for using the reagents to 
detect and quantify expression products in the biological sample. Exemplary
embodiments of the diagnostic methods of the invention are described below in more 
detail. 

Polypeptide detection in diagnosis. In one embodiment, the test sample
is assayed for the level of a differentially expressed polypeptide. Diagnosis can be
accomplished using any of a number of methods to determine the absence or presence
or altered amounts of the differentially expressed polypeptide in the test sample. For
example, detection can utilize staining of cells or histological sections with labeled
antibodies, performed in accordance with conventional methods. Cells can be
permeabilized to stain cytoplasmic molecules. In general, antibodies that specifically
bind a differentially expressed polypeptide of the invention are added to a sample, and
incubated for a period of time sufficient to allow binding to the epitope, usually at least
about 10 minutes. The antibody can be detectably labeled for direct detection (e.g.,
using radioisotopes, enzymes, fluorescers, chemiluminescers, and the like), or can be
used in conjunction with a second stage antibody or reagent to detect binding (e.g.,
biotin with horseradish peroxidase-conjugated avidin, a secondary antibody conjugated
to a fluorescent compound, e.g., fluorescein, rhodamine, Texas red, etc.).
The absence or presence of antibody binding can be determined by various methods,
including flow cytometry of dissociated cells, microscopy, radiography, scintillation
counting, etc. Any suitable alternative methods of qualitative or quantitative detection
of levels or amounts of differentially expressed polypeptide can be used, for example ELISA,
western blot, immunoprecipitation, radioimmunoassay, etc.

mRNA detection. The diagnostic methods of the invention can also or
alternatively involve detection of mRNA encoded by a gene corresponding to a
differentially expressed polynucleotide of the invention. Any suitable qualitative or
quantitative methods known in the art for detecting specific mRNAs can be used.
mRNA can be detected by, for example, in situ hybridization in tissue sections, by
reverse transcriptase-PCR, or in Northern blots containing poly A+ mRNA. One of
skill in the art can readily use these methods to determine differences in the size or
amount of mRNA transcripts between two samples. mRNA expression levels in a
sample can also be determined by generation of a library of expressed sequence tags
(ESTs) from the sample, where the EST library is representative of sequences present in
the sample (Adams, et al., (1991) Science 252:1651). Enumeration of the relative
representation of ESTs within the library can be used to approximate the relative
representation of the gene transcript within the starting sample. The results of EST
analysis of a test sample can then be compared to EST analysis of a reference sample to
determine the relative expression levels of a selected polynucleotide, particularly a
polynucleotide corresponding to one or more of the differentially expressed genes
described herein. Alternatively, gene expression in a test sample can be analyzed
using serial analysis of gene expression (SAGE) methodology (see, e.g., Velculescu et
al., Science (1995) 270:484) or differential display (DD) methodology (see, e.g., U.S.
Patent Nos. 5,776,683 and 5,807,680).
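
The EST enumeration described above can be sketched in a few lines. This is a minimal illustration only: it assumes each EST has already been assigned to a gene, and the function names and the two small libraries are hypothetical and do not come from the specification.

```python
from collections import Counter


def est_relative_representation(est_gene_ids):
    """Return each gene's share of the ESTs in a library (shares sum to 1)."""
    counts = Counter(est_gene_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("EST library is empty")
    return {gene: n / total for gene, n in counts.items()}


def relative_expression(test_ests, reference_ests, gene):
    """Ratio of a gene's EST share in the test library to its share in the reference."""
    test = est_relative_representation(test_ests).get(gene, 0.0)
    ref = est_relative_representation(reference_ests).get(gene, 0.0)
    if ref == 0.0:
        # Gene absent from the reference library: the ratio is undefined or unbounded.
        return float("inf") if test > 0 else float("nan")
    return test / ref


# Hypothetical libraries: each entry names the gene to which one EST was assigned.
test_library = ["geneA"] * 6 + ["geneB"] * 2 + ["geneC"] * 2
reference_library = ["geneA"] * 2 + ["geneB"] * 2 + ["geneC"] * 6

fold = relative_expression(test_library, reference_library, "geneA")
print(f"geneA is represented about {fold:.1f}-fold more in the test library")
```

Shares rather than raw counts are compared so that libraries of different sizes remain comparable.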

Alternatively, gene expression can be analyzed using hybridization
analysis. Oligonucleotides or cDNA can be used to selectively identify or capture DNA
or RNA of specific sequence composition, and the amount of RNA or cDNA hybridized
to a known capture sequence determined qualitatively or quantitatively, to provide
information about the relative representation of a particular message within the pool of
cellular messages in a sample. Hybridization analysis can be designed to allow for
concurrent screening of the relative expression of hundreds to thousands of genes by
using, for example, array-based technologies having high density formats, including
filters, microscope slides, or microchips, or solution-based technologies that use
spectroscopic analysis (e.g., mass spectrometry). One exemplary use of arrays in the
diagnostic methods of the invention is described below in more detail.

Use of a single gene in diagnostic applications. The diagnostic methods
of the invention can focus on the expression of a single differentially expressed gene.
For example, the diagnostic method can involve detecting a differentially expressed
gene, or a polymorphism of such a gene (e.g., a polymorphism in a coding region or
control region), that is associated with disease. Disease-associated polymorphisms can
include deletion or truncation of the gene, mutations that alter expression level and/or
affect activity of the encoded protein, etc.

A number of methods are available for analyzing nucleic acids for the
presence of a specific sequence, e.g., a disease associated polymorphism. Where large
amounts of DNA are available, genomic DNA is used directly. Alternatively, the
region of interest is cloned into a suitable vector and grown in sufficient quantity for
analysis. Cells that express a differentially expressed gene can be used as a source of
mRNA, which can be assayed directly or reverse transcribed into cDNA for analysis.
The nucleic acid can be amplified by conventional techniques, such as the polymerase
chain reaction (PCR), to provide sufficient amounts for analysis, and a detectable label
can be included in the amplification reaction (e.g., using a detectably labeled primer or
detectably labeled oligonucleotides) to facilitate detection. Alternatively, various
methods are also known in the art that utilize oligonucleotide ligation as a means of
detecting polymorphisms, see, e.g., Riley et al., Nucl. Acids Res. (1990) 18:2887; and
Delahunty et al., Am. J. Hum. Genet. (1996) 58:1239.

The amplified or cloned sample nucleic acid can be analyzed by one of a
number of methods known in the art. The nucleic acid can be sequenced by dideoxy or
other methods, and the sequence of bases compared to a selected sequence, e.g., to a
wild-type sequence. Hybridization with the polymorphic or variant sequence can also
be used to determine its presence in a sample (e.g., by Southern blot, dot blot, etc.). The
hybridization pattern of a polymorphic or variant sequence and a control sequence to an
array of oligonucleotide probes immobilized on a solid support, as described in U.S.
Patent No. 5,445,934, or in WO 95/35505, can also be used as a means of identifying
polymorphic or variant sequences associated with disease. Single strand
conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis
(DGGE), and heteroduplex analysis in gel matrices are used to detect conformational
changes created by DNA sequence variation as alterations in electrophoretic mobility.
Alternatively, where a polymorphism creates or destroys a recognition site for a
restriction endonuclease, the sample is digested with that endonuclease, and the
products size fractionated to determine whether the fragment was digested.
Fractionation is performed by gel or capillary electrophoresis, particularly acrylamide or
agarose gels.

Screening for mutations in a gene can be based on the functional or
antigenic characteristics of the protein. Protein truncation assays are useful in detecting
deletions that can affect the biological activity of the protein. Various immunoassays
designed to detect polymorphisms in proteins can be used in screening. Where many
diverse genetic mutations lead to a particular disease phenotype, functional protein
assays have proven to be effective screening tools. The activity of the encoded protein
can be determined by comparison with the wild-type protein.
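
For the restriction endonuclease approach mentioned earlier in this section, the expected fragment sizes can be predicted in silico before a digest is run. The sketch below assumes a linear molecule and a complete digest. EcoRI (which cuts G^AATTC) is used only as a familiar example, and both sequences are hypothetical.

```python
def cut_positions(sequence, site):
    """Return 0-based start indices of every occurrence of `site` in `sequence`."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def fragment_lengths(sequence, site, cut_offset):
    """Fragment lengths from a complete digest of a linear molecule.

    `cut_offset` is the position within the site at which the top strand is cut.
    """
    cuts = [p + cut_offset for p in cut_positions(sequence, site)]
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1  # EcoRI cuts between G and AATTC

wild_type = "ATGCCGAATTCGGTACCTTAGC"  # hypothetical amplified region
variant = "ATGCCGAGTTCGGTACCTTAGC"    # hypothetical A>G change that destroys the site

wt_fragments = fragment_lengths(wild_type, ECORI_SITE, ECORI_OFFSET)
var_fragments = fragment_lengths(variant, ECORI_SITE, ECORI_OFFSET)
print("wild type:", wt_fragments, "variant:", var_fragments)
```

A difference between the two fragment lists is what size fractionation on a gel would reveal.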

Pattern matching in diagnosis using arrays. In another embodiment, the
diagnostic and/or prognostic methods of the invention involve detection of expression
of a selected set of genes in a test sample to produce a test expression pattern (TEP).
The TEP is compared to a reference expression pattern (REP), which is generated by
detection of expression of the selected set of genes in a reference sample (e.g., a
positive or negative control sample). The selected set of genes includes at least one of
the genes of the invention, which genes correspond to the polynucleotide sequences of
SEQ ID NOs:1-3351. Of particular interest is a selected set of genes that includes genes
differentially expressed in the disease for which the test sample is to be screened.

"Reference sequences" or "reference polynucleotides" as used herein in 
the context of differential gene expression analysis and diagnosis/prognosis refers to a 
selected set of polynucleotides, which selected set includes at least one or more of the
differentially expressed polynucleotides described herein. A plurality of reference
sequences, preferably comprising positive and negative control sequences, can be
included as reference sequences. Additional suitable reference sequences are found in
Genbank, Unigene, and other nucleotide sequence databases (including, e.g., expressed 
sequence tag (EST), partial, and full-length sequences). 

"Reference array" means an array having reference sequences for use in 
hybridization with a sample, where the reference sequences include all, at least one of,
or any subset of the differentially expressed polynucleotides described herein. Usually
such an array will include at least 3 different reference sequences, and can include any
one or all of the provided differentially expressed sequences. Arrays of interest can
further comprise sequences, including polymorphisms, of other genetic sequences,
particularly other sequences of interest for screening for a disease or disorder (e.g.,
cancer, dysplasia, or other related or unrelated diseases, disorders, or conditions). The
oligonucleotide sequence on the array will usually be at least about 12 nt in length, and
can be of about the length of the provided sequences, or can extend into the flanking
regions to generate fragments of 100 nt to 200 nt in length or more. Reference arrays
can be produced according to any suitable methods known in the art. For example,
methods of producing large arrays of oligonucleotides are described in U.S. Patent Nos.
5,134,854 and 5,445,934 using light-directed synthesis techniques. Using a computer
controlled system, a heterogeneous array of monomers is converted, through
simultaneous coupling at a number of reaction sites, into a heterogeneous array of
polymers. Alternatively, microarrays are generated by deposition of
oligonucleotides onto a solid substrate, for example as described in PCT published
application no. WO 95/35505.

A "reference expression pattern" or "REP" as used herein refers to the
relative levels of expression of a selected set of genes, particularly of differentially
expressed genes, that is associated with a selected cell type, e.g., a normal cell, a
cancerous cell, a cell exposed to an environmental stimulus, and the like. A "test
expression pattern" or "TEP" refers to relative levels of expression of a selected set of
genes, particularly of differentially expressed genes, in a test sample (e.g., a cell of
unknown or suspected disease state, from which mRNA is isolated).

REPs can be generated in a variety of ways according to methods well
known in the art. For example, REPs can be generated by hybridizing a control sample
to an array having a selected set of polynucleotides (particularly a selected set of
differentially expressed polynucleotides), acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the REP with
a TEP. Alternatively, all expressed sequences in a control sample can be isolated and
sequenced, e.g., by isolating mRNA from a control sample, converting the mRNA into
cDNA, and sequencing the cDNA. The resulting sequence information roughly or
precisely reflects the identity and relative number of expressed sequences in the sample.
The sequence information can then be stored in a format (e.g., a computer-readable
format) that allows for ready comparison of the REP with a TEP. The REP can be
normalized prior to or after data storage, and/or can be processed to selectively remove
sequences of expressed genes that are of less interest or that might complicate analysis
(e.g., some or all of the sequences associated with housekeeping genes can be
eliminated from REP data).

TEPs can be generated in a similar manner, e.g., by hybridizing a
test sample to an array having a selected set of polynucleotides, particularly a selected
set of differentially expressed polynucleotides, acquiring the hybridization data from the
array, and storing the data in a format that allows for ready comparison of the TEP with
a REP. The REP and TEP to be used in a comparison can be generated simultaneously,
or the TEP can be compared to previously generated and stored REPs.
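
The storage and clean-up steps described for REPs and TEPs might look as follows in practice. This is a sketch under stated assumptions: the specification does not prescribe a normalization, so scaling to a total of 1 is chosen here for simplicity. The signal values are hypothetical, ACTB and GAPDH stand in for housekeeping genes, and JSON is one possible computer-readable format.

```python
import json


def build_expression_pattern(raw_signals, housekeeping=()):
    """Drop housekeeping genes, then scale the remaining signals to sum to 1."""
    kept = {gene: s for gene, s in raw_signals.items() if gene not in housekeeping}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("no signal left to normalize")
    return {gene: s / total for gene, s in kept.items()}


# Hypothetical raw hybridization signals from a control sample.
raw_rep = {"geneA": 200.0, "geneB": 600.0, "ACTB": 5000.0, "GAPDH": 4200.0}

rep = build_expression_pattern(raw_rep, housekeeping={"ACTB", "GAPDH"})

# Store the pattern so that it can be compared with a TEP generated later.
stored_rep = json.dumps(rep, sort_keys=True)
print(stored_rep)
```

Applying the same function to a test sample yields a TEP on the same scale as the stored REP.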

In one embodiment of the invention, comparison of a TEP with a REP
involves hybridizing a test sample with a reference array, where the reference array has
one or more reference sequences for use in hybridization with a sample. The reference
sequences include all, at least one of, or any subset of the differentially expressed
polynucleotides described herein. Hybridization data for the test sample is acquired, the
data normalized, and the produced TEP compared with a REP generated using an array
having the same or similar selected set of differentially expressed polynucleotides.
Probes that correspond to sequences differentially expressed between the two samples
will show decreased or increased hybridization efficiency for one of the samples
relative to the other.

Methods for collection of data from hybridization of samples with
reference arrays are well known in the art. For example, the polynucleotides of the
reference and test samples can be generated using a detectable fluorescent label, and
hybridization of the polynucleotides in the samples detected by scanning the
microarrays for the presence of the detectable label using, for example, a microscope
and light source for directing light at a substrate. A photon counter detects fluorescence
from the substrate, while an x-y translation stage varies the location of the substrate. A
confocal detection device that can be used in the subject methods is described in U.S.
Patent No. 5,631,734. A scanning laser microscope is described in Shalon et al.,
Genome Res. (1996) 6:639. A scan, using the appropriate excitation line, is performed
for each fluorophore used. The digital images generated from the scan are then
combined for subsequent analysis. For any particular array element, the ratio of the
fluorescent signal from one sample (e.g., a test sample) is compared to the fluorescent
signal from another sample (e.g., a reference sample), and the relative signal intensity
determined.
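
The per-element comparison of the two fluorescent signals can be illustrated as below. The element names and intensities are hypothetical. Reporting the ratio on a log2 scale is an assumption made here so that increases and decreases are symmetric about zero; the specification itself only calls for a relative signal intensity.

```python
import math


def relative_intensity(test_signal, reference_signal):
    """Per-element log2(test / reference) ratios for a two-channel scan."""
    ratios = {}
    for element, test_value in test_signal.items():
        ref_value = reference_signal.get(element)
        if ref_value is None or test_value <= 0 or ref_value <= 0:
            continue  # skip elements lacking a usable signal in both channels
        ratios[element] = math.log2(test_value / ref_value)
    return ratios


# Hypothetical background-corrected intensities for three array elements.
test_scan = {"spot_1": 800.0, "spot_2": 100.0, "spot_3": 250.0}
reference_scan = {"spot_1": 200.0, "spot_2": 400.0, "spot_3": 250.0}

log_ratios = relative_intensity(test_scan, reference_scan)
for element, value in sorted(log_ratios.items()):
    print(element, round(value, 2))
```

A positive value indicates a stronger signal in the test sample, a negative value a stronger signal in the reference sample.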
Methods for analyzing the data collected from hybridization to arrays are
well known in the art. For example, where detection of hybridization involves a
fluorescent label, data analysis can include the steps of determining fluorescent intensity
as a function of substrate position from the data collected, removing outliers, i.e., data
deviating from a predetermined statistical distribution, and calculating the relative
binding affinity of the targets from the remaining data. The resulting data can be
displayed as an image with the intensity in each region varying according to the binding
affinity between targets and probes.
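
The outlier-removal and relative-binding steps can be sketched as follows. The specification leaves the "predetermined statistical distribution" open, so the median absolute deviation (MAD) rule used here is an assumption, as are the threshold of three scaled deviations, the probe names, and the intensity readings.

```python
from statistics import mean, median


def remove_outliers(values, max_deviations=3.0):
    """Keep values within `max_deviations` scaled MADs of the median."""
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return list(values)  # no measurable spread, so nothing is rejected
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [v for v in values if abs(v - centre) / scale <= max_deviations]


def relative_binding(intensities_by_probe):
    """Mean outlier-filtered intensity per probe, scaled to the strongest probe."""
    cleaned = {p: mean(remove_outliers(v)) for p, v in intensities_by_probe.items()}
    strongest = max(cleaned.values())
    return {p: m / strongest for p, m in cleaned.items()}


# Hypothetical replicate readings for two probes; 900.0 mimics a dust artifact.
readings = {
    "probe_1": [100.0, 102.0, 98.0, 101.0, 99.0, 900.0],
    "probe_2": [50.0, 49.0, 51.0, 50.0, 52.0, 48.0],
}

binding = relative_binding(readings)
print(binding)
```

The resulting values between 0 and 1 could be mapped to pixel intensities to produce the kind of image described above.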

In general, the test sample is classified as having a gene expression
profile corresponding to that associated with a disease or non-disease state by
comparing the TEP generated from the test sample to one or more REPs generated from
reference samples (e.g., from samples associated with cancer or specific stages of
cancer, dysplasia, samples affected by a disease other than cancer, normal samples,
etc.). The criteria for a match or a substantial match between a TEP and a REP include
expression of the same or substantially the same set of reference genes, as well as
expression of these reference genes at substantially the same levels (e.g., no significant
difference between the samples for a signal associated with a selected reference
sequence after normalization of the samples, or at least no greater than about 25% to
about 40% difference in signal strength for a given reference sequence). In general, a
pattern match between a TEP and a REP includes a match in expression, preferably a
match in qualitative or quantitative expression level, of at least one of, all or any subset
of the differentially expressed genes of the invention.
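
The match criteria above lend themselves to a short illustration. The sketch assumes normalized, strictly positive expression levels and uses the lower bound of the stated tolerance (25%) as the default. The gene names, patterns, and labels are hypothetical.

```python
def fractional_difference(test_level, reference_level):
    """Absolute difference expressed as a fraction of the reference level."""
    return abs(test_level - reference_level) / reference_level


def matches_rep(tep, rep, max_difference=0.25):
    """True when every REP gene appears in the TEP within `max_difference`.

    Reference levels are assumed to be strictly positive.
    """
    for gene, ref_level in rep.items():
        if gene not in tep:
            return False
        if fractional_difference(tep[gene], ref_level) > max_difference:
            return False
    return True


def classify(tep, reps, max_difference=0.25):
    """Return the labels of all REPs that the TEP matches."""
    return [label for label, rep in reps.items()
            if matches_rep(tep, rep, max_difference)]


# Hypothetical normalized patterns for two reference states and one test sample.
reps = {
    "normal": {"geneA": 1.0, "geneB": 1.0, "geneC": 1.0},
    "tumor": {"geneA": 4.0, "geneB": 0.5, "geneC": 1.0},
}
tep = {"geneA": 3.6, "geneB": 0.55, "geneC": 1.1}

labels = classify(tep, reps)
print(labels)
```

Passing `max_difference=0.40` applies the looser end of the stated range.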

Pattern matching can be performed manually, or can be performed using
a computer program. Methods for preparation of substrate matrices (e.g., arrays),
design of oligonucleotides for use with such matrices, labeling of probes, hybridization
conditions, scanning of hybridized matrices, and analysis of patterns generated,
including comparison analysis, are described in, for example, U.S. Patent No.
5,800,992.

Diagnosis, Prognosis and Management of Cancer

The polynucleotides of the invention and their gene products are of
particular interest as genetic or biochemical markers (e.g., in blood or tissues) that will
detect the earliest changes along the carcinogenesis pathway and/or to monitor the
efficacy of various therapies and preventive interventions. For example, the level of
expression of certain polynucleotides can be indicative of a poorer prognosis, and
therefore warrant more aggressive chemo- or radio-therapy for a patient or vice versa.
The correlation of novel surrogate tumor specific features with response to treatment
and outcome in patients can define prognostic indicators that allow the design of
tailored therapy based on the molecular profile of the tumor. These therapies include
antibody targeting and gene therapy. Determining expression of certain polynucleotides
and comparison of a patient's profile with known expression in normal tissue and
variants of the disease allows a determination of the best possible treatment for a
patient, both in terms of specificity of treatment and in terms of comfort level of the
patient. Surrogate tumor markers, such as polynucleotide expression, can also be used
to better classify, and thus diagnose and treat, different forms and disease states of
cancer. Two classifications widely used in oncology that can benefit from identification
of the expression levels of the polynucleotides of the invention are staging of the
cancerous disorder, and grading the nature of the cancerous tissue.

The polynucleotides of the invention can be useful to monitor patients
having or susceptible to cancer to detect potentially malignant events at a molecular
level before they are detectable at a gross morphological level. Furthermore, a
polynucleotide of the invention identified as important for one type of cancer can also
have implications for development or risk of development of other types of cancer, e.g.,
where a polynucleotide is differentially expressed across various cancer types. Thus,
for example, expression of a polynucleotide that has clinical implications for metastatic
colon cancer can also have clinical implications for stomach cancer or endometrial
cancer.

Staging. Staging is a process used by physicians to describe how
advanced the cancerous state is in a patient. Generally, if a cancer is only detectable in
the area of the primary lesion without having spread to any lymph nodes it is called
Stage I. If it has spread only to the closest lymph nodes, it is called Stage II. In Stage
III, the cancer has generally spread to the lymph nodes in near proximity to the site of
the primary lesion. Cancers that have spread to a distant part of the body, such as the
liver, bone, brain or other site, are Stage IV, the most advanced stage.

The polynucleotides of the invention can facilitate fine-tuning of the
staging process by identifying markers for the aggressivity of a cancer, e.g., the
metastatic potential, as well as the presence in different areas of the body. Thus, a Stage
II cancer with a polynucleotide signifying a high metastatic potential cancer can be used
to change a borderline Stage II tumor to a Stage III tumor, justifying more aggressive
therapy. Conversely, the presence of a polynucleotide signifying a lower metastatic
potential allows more conservative staging of a tumor.
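
The staging rule and its marker-based adjustment can be written out as a small decision function. This is an illustrative simplification of the general description above and not a clinical staging system. The enumeration, the `borderline` flag, and the "high"/"low" marker labels are hypothetical names introduced for the sketch.

```python
from enum import IntEnum


class Spread(IntEnum):
    PRIMARY_ONLY = 1    # detectable only in the area of the primary lesion
    CLOSEST_NODES = 2   # spread only to the closest lymph nodes
    REGIONAL_NODES = 3  # spread to lymph nodes near the primary lesion
    DISTANT = 4         # spread to liver, bone, brain or another distant site


def adjusted_stage(spread, borderline=False, metastatic_marker=None):
    """Stage 1 to 4 from extent of spread, adjusted for borderline Stage II tumors.

    `metastatic_marker` is "high", "low" or None when no marker data exist.
    """
    stage = int(spread)
    if stage == 2 and borderline and metastatic_marker == "high":
        stage = 3  # marker of high metastatic potential: stage upward
    # A "low" marker leaves a borderline tumor at the more conservative stage.
    return stage


print(adjusted_stage(Spread.CLOSEST_NODES, borderline=True, metastatic_marker="high"))
print(adjusted_stage(Spread.CLOSEST_NODES, borderline=True, metastatic_marker="low"))
```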

Grading of cancers. Grade is a term used to describe how closely a
tumor resembles normal tissue of its same type. The microscopic appearance of a tumor
is used to identify tumor grade based on parameters such as cell morphology, cellular
organization, and other markers of differentiation. As a general rule, the grade of a
tumor corresponds to its rate of growth or aggressiveness, with undifferentiated or
high-grade tumors being more aggressive than well differentiated or low-grade tumors.
The following guidelines are generally used for grading tumors: 1) GX Grade cannot be
assessed; 2) G1 Well differentiated; 3) G2 Moderately well differentiated; 4) G3 Poorly
differentiated; 5) G4 Undifferentiated. The polynucleotides of the invention can be
especially valuable in determining the grade of the tumor, as they not only can aid in
determining the differentiation status of the cells of a tumor, they can also identify
factors other than differentiation that are valuable in determining the aggressivity of a
tumor, such as metastatic potential.

Detection of lung cancer. The polynucleotides of the invention can be
used to detect lung cancer in a subject. Although there are more than a dozen different
kinds of lung cancer, the two main types of lung cancer are small cell and nonsmall cell,
which encompass about 90% of all lung cancer cases. Small cell carcinoma (also called
oat cell carcinoma) usually starts in one of the larger bronchial tubes, grows fairly
rapidly, and is likely to be large by the time of diagnosis. Nonsmall cell lung cancer
(NSCLC) is made up of three general subtypes of lung cancer. Epidermoid carcinoma
(also called squamous cell carcinoma) usually starts in one of the larger bronchial tubes
and grows relatively slowly. The size of these tumors can range from very small to
quite large. Adenocarcinoma starts growing near the outside surface of the lung and can
vary in both size and growth rate. Some slowly growing adenocarcinomas are described
as alveolar cell cancer. Large cell carcinoma starts near the surface of the lung, grows
rapidly, and the growth is usually fairly large when diagnosed. Other less common
forms of lung cancer are carcinoid, cylindroma, mucoepidermoid, and malignant
mesothelioma.

The polynucleotides of the invention, e.g., polynucleotides differentially
expressed in normal cells versus cancerous lung cells (e.g., tumor cells of high or low
metastatic potential) or between types of cancerous lung cells (e.g., high metastatic
versus low metastatic), can be used to distinguish types of lung cancer as well as
identifying traits specific to a certain patient's cancer and selecting an appropriate
therapy. For example, if the patient's biopsy expresses a polynucleotide that is
associated with a low metastatic potential, it may justify leaving a larger portion of the
patient's lung in surgery to remove the lesion. Alternatively, a smaller lesion with
expression of a polynucleotide that is associated with high metastatic potential may
justify a more radical removal of lung tissue and/or the surrounding lymph nodes, even
if no metastasis can be identified through pathological examination.

Detection of breast cancer. The majority of breast cancers are
adenocarcinoma subtypes, which can be summarized as follows: 1) ductal carcinoma
in situ (DCIS), including comedocarcinoma; 2) infiltrating (or invasive) ductal
carcinoma (IDC); 3) lobular carcinoma in situ (LCIS); 4) infiltrating (or invasive)
lobular carcinoma (ILC); 5) inflammatory breast cancer; 6) medullary carcinoma;
7) mucinous carcinoma; 8) Paget's disease of the nipple; 9) Phyllodes tumor; and
10) tubular carcinoma.

The expression of polynucleotides of the invention can be used in the
diagnosis and management of breast cancer, as well as to distinguish between types of
breast cancer. Detection of breast cancer can be determined using expression levels of
any of the appropriate polynucleotides of the invention, either alone or in combination.
Determination of the aggressive nature and/or the metastatic potential of a breast cancer
can also be determined by comparing levels of one or more polynucleotides of the
invention and comparing levels of another sequence known to vary in cancerous tissue,
e.g., ER expression. In addition, development of breast cancer can be detected by
examining the ratio of expression of a differentially expressed polynucleotide to the
levels of steroid hormones (e.g., testosterone or estrogen) or to other hormones (e.g.,
growth hormone, insulin). Thus expression of specific marker polynucleotides can be
used to discriminate between normal and cancerous breast tissue, to discriminate
between breast cancers with different cells of origin, to discriminate between breast
cancers with different potential metastatic rates, etc.

Detection of colon cancer. The polynucleotides of the invention
exhibiting the appropriate expression pattern can be used to detect colon cancer in a
subject. Colorectal cancer is one of the most common neoplasms in humans and
perhaps the most frequent form of hereditary neoplasia. Prevention and early detection
are key factors in controlling and curing colorectal cancer. Colorectal cancer begins as
polyps, which are small, benign growths of cells that form on the inner lining of the
colon. Over a period of several years, some of these polyps accumulate additional
mutations and become cancerous. Multiple familial colorectal cancer disorders have
been identified, which are summarized as follows: 1) Familial adenomatous polyposis
(FAP); 2) Gardner's syndrome; 3) Hereditary nonpolyposis colon cancer (HNPCC); and
4) Familial colorectal cancer in Ashkenazi Jews. The expression of appropriate
polynucleotides of the invention can be used in the diagnosis, prognosis and
management of colorectal cancer. Detection of colon cancer can be determined using
expression levels of any of these sequences alone or in combination with the levels of
expression. Determination of the aggressive nature and/or the metastatic potential of a
colon cancer can be determined by comparing levels of one or more polynucleotides of
the invention and comparing total levels of another sequence known to vary in
cancerous tissue, e.g., expression of p53, DCC, ras, or FAP (see, e.g., Fearon ER, et al.,
Cell (1990) 61(5):759; Hamilton SR et al., Cancer (1993) 72:957; Bodmer W, et al.,
Nat Genet. (1994) 4(3):217; Fearon ER, Ann N Y Acad Sci. (1995) 765:101). For
example, development of colon cancer can be detected by examining the ratio of any of
the polynucleotides of the invention to the levels of oncogenes (e.g., ras) or tumor
suppressor genes (e.g., FAP or p53). Thus expression of specific marker
polynucleotides can be used to discriminate between normal and cancerous colon tissue,
to discriminate between colon cancers with different cells of origin, to discriminate
between colon cancers with different potential metastatic rates, etc.

Use of Polynucleotides to Screen for Peptide Analogs and Antagonists

Polypeptides encoded by the instant polynucleotides and corresponding
full length genes can be used to screen peptide libraries to identify binding partners,
such as receptors, from among the encoded polypeptides. Peptide libraries can be
synthesized according to methods known in the art (see, e.g., U.S. Patent No. 5,010,175,
and WO 91/17823). Agonists or antagonists of the polypeptides of the invention can be
screened using any available method known in the art, such as signal transduction,
antibody binding, receptor binding, mitogenic assays, chemotaxis assays, etc. The
assay conditions ideally should resemble the conditions under which the native activity
is exhibited in vivo, that is, under physiologic pH, temperature, and ionic strength.
Suitable agonists or antagonists will exhibit strong inhibition or enhancement of the
native activity at concentrations that do not cause toxic side effects in the subject.
Agonists or antagonists that compete for binding to the native polypeptide can require
concentrations equal to or greater than the native concentration, while inhibitors capable
of binding irreversibly to the polypeptide can be added in concentrations on the order of
the native concentration.
Such screening and experimentation can lead to identified

in cells that possess the receptor as a result of genetic engineering. Further, if the
novel receptor shares biologically important characteristics with a known receptor,
information about agonist/antagonist binding can facilitate development of improved
agonists/antagonists of the known receptor.

Pharmaceutical Compositions and Therapeutic Uses

Pharmaceutical compositions of the invention

"therapeutically effective amount" as used herein refers to an amount of a therapeutic
agent to treat, ameliorate, or prevent

0.05 mg/kg to about 10 mg/kg of the DNA constructs in the individual to which it is
administered.

A pharmaceutical composition can also contain a pharmaceutically
acceptable carrier. The term "pharmaceutically acceptable carrier" refers to a carrier for
administration of a therapeutic agent, such as antibodies or a

The term refers to any pharmaceutical carrier that does not
itself induce the production of antibodies harmful to the individual receiving the
composition, and which can be administered without undue toxicity. Suitable carriers
can be large, slowly metabolized macromolecules such as proteins, polysaccharides,
polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers,
and inactive virus particles. Such carriers are well known to those of ordinary skill in
the art. Pharmaceutically acceptable carriers in therapeutic compositions can include
liquids such as water, saline, glycerol and ethanol. Auxiliary substances, such as
wetting or emulsifying agents, pH buffering substances, and the like, can also be present
in such vehicles. Typically, the therapeutic compositions are prepared as injectables,
either as liquid solutions or suspensions; solid forms suitable for solution in, or
suspension in, liquid vehicles prior to injection can also be prepared. Liposomes are
included within the definition of a pharmaceutically acceptable carrier.
Pharmaceutically acceptable salts can also be present in the pharmaceutical
composition, e.g., mineral acid salts such as hydrochlorides, hydrobromides,
phosphates, sulfates, and the like, and the salts of organic acids

Once formulated, the compositions of the invention
can be (1) administered directly to the subject (e.g., as polynucleotide or polypeptides);
or (2) delivered ex vivo, to cells derived from the subject (e.g., as in ex vivo gene
therapy). Direct delivery of the compositions will generally be accomplished by
injection, e.g., subcutaneously, intraperitoneally, intravenously or
intramuscularly, intratumorally or to the interstitial space of a tissue. Dosage treatment
can be a single dose schedule or a multiple dose schedule.

Methods for the ex vivo delivery and reimplantation of transformed cells
into a subject are known in the art and described in, e.g., International Publication No.
WO 93/14778. Examples of cells useful in ex vivo applications include, for example,
stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic cells, or
tumor cells. Generally, delivery of nucleic acids for both ex vivo and in vitro
applications can be accomplished by, for example, dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated transfection, protoplast fusion,
electroporation, encapsulation of the polynucleotides in liposomes, and direct
microinjection of the DNA into nuclei, all well known in the art.

Once a gene corresponding to a polynucleotide of the invention has been
found to correlate with a proliferative disorder, such as neoplasia, dysplasia, and
hyperplasia, the disorder can be amenable to treatment by administration of a
therapeutic agent based on the provided polynucleotide, corresponding polypeptide or
other corresponding molecule (e.g., antisense, ribozyme, etc.).

The dose and the means of administration of the inventive
pharmaceutical compositions are determined based on the specific qualities of the
therapeutic composition, the condition, age, and weight of the patient, the progression
of the disease, and other relevant factors. For example, administration of
polynucleotide therapeutic composition agents of the invention includes local or
systemic administration, including injection, oral administration, particle gun or
catheterized administration, and topical administration. Preferably, the therapeutic
polynucleotide composition contains an expression construct comprising a promoter
operably linked to a polynucleotide of at least 12, 22, 25, 30, or 35 contiguous nt of the
polynucleotide disclosed herein. Various methods can be used to administer the
therapeutic composition directly to a specific site in the body. For example, a small
metastatic lesion is located and the therapeutic composition injected several times in
several different locations within the body of the tumor. Alternatively, arteries which
serve a tumor are identified, and the therapeutic composition injected into such an
artery, in order to deliver the composition directly into the tumor. A tumor that has a
necrotic center is aspirated and the composition injected directly into the now empty
center of the tumor. The antisense composition is directly administered to the surface of
the tumor, for example, by topical application of the composition. X-ray imaging is
used to assist in certain of the above delivery methods.

Receptor-mediated targeted delivery of therapeutic compositions
containing an antisense polynucleotide, subgenomic polynucleotides, or antibodies to
specific tissues can also be used. Receptor-mediated DNA delivery techniques are
described in, for example, Findeis et al., Trends Biotechnol. (1993) 11:202; Chiou et al.,
Gene Therapeutics: Methods and Applications of Direct Gene Transfer (J.A. Wolff,
ed.) (1994); Wu et al., J. Biol. Chem. (1988) 262:621; Wu et al., J. Biol. Chem. (1994)
269:542; Zenke et al., Proc. Natl. Acad. Sci. (USA) (1990) 87:3655; Wu et al., J. Biol.
Chem. (1991) 266:338. Therapeutic compositions containing a polynucleotide are
administered in a range of about 100 ng to about 200 mg of DNA for local
administration in a gene therapy protocol. Concentration ranges of about 500 ng to
about 50 mg, about 1 μg to about 2 mg, about 5 μg to about 500 μg, and about 20 μg
to about 100 μg of DNA can also be used during a gene therapy protocol. Factors such
as method of action (e.g., for enhancing or inhibiting levels of the encoded gene
product) and efficacy of transformation and expression are considerations which will



WO 01/02568 



PCT/US00/18374



affect the dosage required for ultimate efficacy of the antisense subgenomic
polynucleotides. Where greater expression is desired over a larger area of tissue, larger
amounts of antisense subgenomic polynucleotides or the same amounts readministered
in a successive protocol of administrations, or several administrations to different
adjacent or close tissue portions of, for example, a tumor site, may be required to effect
a positive therapeutic outcome. In all cases, routine experimentation in clinical trials
will determine specific ranges for optimal therapeutic effect. For polynucleotide related
genes encoding polypeptides or proteins with anti-inflammatory activity, suitable use,
doses, and administration are described in U.S. Patent No. 5,654,173.
The therapeutic polynucleotides and polypeptides of the present
invention can be delivered using gene delivery vehicles. The gene delivery vehicle can
be of viral or non-viral origin (see generally, Jolly, Cancer Gene Therapy (1994) 1:51;
Kimura, Human Gene Therapy (1994) 5:845; Connelly, Human Gene Therapy (1995)
1:185; and Kaplitt, Nature Genetics (1994) 6:148). Expression of such coding
sequences can be induced using endogenous mammalian or heterologous promoters.
Expression of the coding sequence can be either constitutive or regulated.

Viral-based vectors for delivery of a desired polynucleotide and
expression in a desired cell are well known in the art. Exemplary viral-based vehicles
include, but are not limited to, recombinant retroviruses (see, e.g., WO 90/07936; WO
94/03622; WO 93/25698; WO 93/25234; U.S. Patent No. 5,219,740; WO 93/11230;
WO 93/10218; U.S. Patent No. 4,777,127; GB Patent No. 2,200,651; EP 0 345 242; and
WO 91/02805), alphavirus-based vectors (e.g., Sindbis virus vectors, Semliki forest
virus (ATCC VR-67; ATCC VR-1247), Ross River virus (ATCC VR-373; ATCC VR-
1246) and Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250;
ATCC VR 1249; ATCC VR-532)), and adeno-associated virus (AAV) vectors (see, e.g.,
WO 94/12649, WO 93/03769; WO 93/19191; WO 94/28938; WO 95/11984 and WO
95/00655). Administration of DNA linked to killed adenovirus as described in Curiel,
Hum. Gene Ther. (1992) 3:147 can also be employed.

Non-viral delivery vehicles and methods can also be employed,
including, but not limited to, polycationic condensed DNA linked or unlinked to killed
adenovirus alone (see, e.g., Curiel, Hum. Gene Ther. (1992) 3:147); ligand-linked
DNA (see, e.g., Wu, J. Biol. Chem. (1989) 264:16985); eukaryotic cell delivery vehicles
(see, e.g., U.S. Patent No. 5,814,482; WO 95/07994; WO 96/17072;
WO 95/30763; and WO 97/42338) and nucleic charge neutralization or fusion with cell
membranes. Naked DNA can also be employed. Exemplary naked DNA introduction
methods are described in WO 90/11092 and U.S. Patent No. 5,580,859. Liposomes that
can act as gene delivery vehicles are described in U.S. Patent No. 5,422,120; WO
95/13796; WO 94/23697; WO 91/14445; and EP 0524968. Additional approaches are
described in Philip, Mol. Cell Biol. (1994) 14:2411, and in Woffendin, Proc. Natl.
Acad. Sci. (1994) 91:1581.

Further non-viral delivery suitable for use includes mechanical delivery
systems such as the approach described in Woffendin et al., Proc. Natl. Acad. Sci. USA
91(24):11581 (1994). Moreover, the coding sequence and the product of expression of
such can be delivered through deposition of photopolymerized hydrogel materials or
use of ionizing radiation (see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).
Other conventional methods for gene delivery that can be used for delivery of the
coding sequence include, for example, use of a hand-held gene transfer particle gun (see,
e.g., U.S. Patent No. 5,149,655); and use of ionizing radiation for activating a transferred
gene (see, e.g., U.S. Patent No. 5,206,152 and WO 92/11033).

The present invention will now be illustrated by reference to the
following examples which set forth particularly advantageous embodiments. However,
it should be noted that these embodiments are illustrative and are not to be construed as
restricting the invention in any way.

EXAMPLES

EXAMPLE 1

Source of Biological Materials and Overview of Novel Polynucleotides
Expressed by the Biological Materials

Cell lines and human normal and tumor tissue were used to construct
cDNA libraries from mRNA isolated from the cells and tissues. Most sequences were
about 275-300 nucleotides in length. The cell lines include the KM12L4-A cell line, a
high metastatic colon cancer cell line (Morikawa et al., Cancer Research (1988)
48:6863). The KM12L4-A cell line is derived from the KM12C cell line. The KM12C
cell line, which is poorly metastatic (low metastatic), was established in culture from a
Dukes' stage B2 surgical specimen (Morikawa et al. Cancer Res. (1988) 48:6863). The
KM12L4-A is a highly metastatic subline derived from KM12C (Yeatman et al. Nucl.
Acids. Res. (1995) 23:4007; Bao-Ling et al. Proc. Annu. Meet. Am. Assoc. Cancer. Res.
(1995) 21:3269). The KM12C and KM12C-derived cell lines (e.g., KM12L4,
KM12L4-A, etc.) are well-recognized in the art as model cell lines for the study of
colon cancer (see, e.g., Morikawa et al., supra; Radinsky et al. Clin. Cancer Res.
(1995) 1:19; Yeatman et al., (1995) supra; Yeatman et al., Clin. Exp. Metastasis (1996)
14:246). These and other cell lines and tissue are described in Table 6.

The sequences of the isolated polynucleotides were first masked to
eliminate low complexity sequences using the XBLAST masking program (Claverie
"Effective Large-Scale Sequence Similarity Searches," In: Computer Methods for
Macromolecular Sequence Analysis, Doolittle, ed., Meth. Enzymol. 266:212-227
Academic Press, NY, NY (1996); see particularly Claverie, in "Automated DNA
Sequencing and Analysis Techniques" Adams et al., eds., Chap. 36, p. 267 Academic
Press, San Diego, 1994 and Claverie et al. Comput. Chem. (1993) 17:191). Generally,
masking does not influence the final search results, except to eliminate sequences of
relative little interest due to their low complexity, and to eliminate multiple hits based
on similarity to repetitive regions common to multiple sequences, e.g., Alu repeats. The
sequences remaining after masking were then used in a BLASTN vs. Genbank search;
sequences that exhibited greater than 70% overlap, 99% identity, and a p value of less
than 1 x 10^-40 were discarded. Sequences from this search also were discarded if the
inclusive parameters were met, but the sequence was ribosomal or vector-derived.

The resulting sequences from the previous search were classified into
three groups (1, 2 and 3 below) and searched in a BLASTX vs. NRP (non-redundant
proteins) database search: (1) unknown (no hits in the Genbank search), (2) weak
similarity (greater than 45% identity and p value of less than 1 x 10^[illegible]), and (3) high
similarity (greater than 60% [illegible] and p value of less than 1 x 10^[illegible]).
Sequences having greater than 70% overlap, greater than 99% identity, and p
value of less than 1 x 10^-40 were discarded.

The remaining sequences were classified as unknown (no hits), weak
similarity, [illegible] less were discarded. Sequences with a p value of less than
1 x 10^[illegible] when compared to a database sequence of human origin were also
excluded. Second, a BLASTN vs. Patent GeneSeq database search was performed and
sequences having greater than [illegible] identity, p value less than 1 x 10^[illegible],
and greater than 99% overlap were discarded.

The remaining sequences were subjected to screening using other rules
and redundancies in the dataset. Sequences with a p value of less than 1 x 10^[illegible] in
relation to a database sequence of human origin were specifically excluded. The final
result provided the 3351 sequences listed in the accompanying Sequence Listing. Each
identified polynucleotide represents sequence from at least a partial mRNA transcript.
Polynucleotides that were determined to be novel were assigned a sequence
identification number.
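The first screening rule described above (the BLASTN vs. Genbank step) can be sketched as a simple predicate. This is only an illustration under stated assumptions: the `BlastHit` record, its field names, and the function `is_already_known` are hypothetical and are not part of the original analysis; only the thresholds (greater than 70% overlap, 99% identity, p value below 1 x 10^-40) are taken from the text, and treating "99% identity" as "at least 99%" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical summary of the best database hit for one query sequence."""
    overlap_pct: float   # percent of the query covered by the alignment
    identity_pct: float  # percent identity within the alignment
    p_value: float       # p value reported for the hit


def is_already_known(hit: BlastHit) -> bool:
    """Return True if the hit meets all three discard thresholds from the text."""
    return (
        hit.overlap_pct > 70.0
        and hit.identity_pct >= 99.0   # assumption: "99% identity" read as >= 99%
        and hit.p_value < 1e-40
    )


# A near-perfect match to a database entry is discarded; a weaker hit is kept.
hits = [BlastHit(85.0, 99.5, 1e-60), BlastHit(85.0, 95.0, 1e-60)]
kept = [h for h in hits if not is_already_known(h)]
```

All three conditions must hold together for a sequence to be discarded, so a long but lower-identity alignment survives this step.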

The novel polynucleotides were assigned sequence identification numbers
SEQ ID NOs:1-3351. The first 1847 DNA sequences corresponding to the novel
polynucleotides are provided in the Sequence Listing in Table 1. DNA sequences
corresponding to the novel polynucleotides of SEQ ID NOs:1848-3351 are provided in the
Sequence Listing in Table 2. The DNA sequences of Table 2, while numbered SEQ ID
1-1504, correspond to SEQ ID NOs:1848-3351 in the Sequence Listing, e.g., Table 2 SEQ ID
1 is SEQ ID NO:1848, Table 2 SEQ ID 2 is SEQ ID NO:1849, etc. Each DNA sequence in
Table 2 is uniquely identified by a number that is 1847 less than its SEQ ID NO in the
Sequence Listing. Tables 1 and 2 provide: 1) the SEQ ID NO assigned to each sequence
for use in the present specification or a corresponding number; 2) the sequence name used
as an internal identifier of the sequence; 3) the name assigned to the clone from which the
sequence was isolated; and 4) the number of the cluster to which the sequence is assigned
(Cluster ID; where the cluster ID is 0, the sequence was not assigned to any cluster).
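The numbering offset between Table 2 and the Sequence Listing can be expressed as a pair of conversions. The function names below are hypothetical helpers introduced only for illustration; the offset of 1847 and the ranges 1-1504 and 1848-3351 come from the text.

```python
TABLE2_OFFSET = 1847  # Table 2 numbers are 1847 less than the SEQ ID NO


def table2_to_seq_id_no(table2_number: int) -> int:
    """Convert a Table 2 number (1-1504) to its SEQ ID NO (1848-3351)."""
    if not 1 <= table2_number <= 1504:
        raise ValueError("Table 2 numbers run from 1 to 1504")
    return table2_number + TABLE2_OFFSET


def seq_id_no_to_table2(seq_id_no: int) -> int:
    """Convert a SEQ ID NO (1848-3351) to its Table 2 number (1-1504)."""
    if not 1848 <= seq_id_no <= 3351:
        raise ValueError("only SEQ ID NOs 1848-3351 appear in Table 2")
    return seq_id_no - TABLE2_OFFSET
```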

Because the provided polynucleotides represent partial mRNA
transcripts, two or more polynucleotides of the invention may represent different
regions of the same mRNA transcript and the same gene. Thus, if two or more SEQ ID
NOs: are identified as belonging to the same clone, then either sequence can be used to
obtain the full-length mRNA or gene.

EXAMPLE 2 

Results of Public Database Search to Identify Function of Gene Products

SEQ ID NOs:1-3351 were translated in all three reading frames to
determine the best alignment with the individual sequences. These amino acid
sequences and nucleotide sequences are referred to, generally, as query sequences,
which are aligned with the individual sequences. Query and individual sequences were
aligned using the BLAST programs, available over the world wide web at
http://www.ncbi.nlm.nih.gov/BLAST/. Again the sequences were masked to various
extents to prevent searching of repetitive sequences or poly-A sequences using the
XBLAST program for masking low complexity as described above in Example 1.
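Translation of a nucleotide query in three reading frames, as described above, can be illustrated with a short sketch. It assumes the standard genetic code and forward-strand frames only; the function names are hypothetical, and the BLAST programs themselves perform this step internally, so this is an illustration rather than the procedure actually used.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame (0, 1 or 2); '*' marks a stop codon."""
    dna = dna.upper()
    residues = []
    for i in range(frame, len(dna) - 2, 3):
        # Codons containing ambiguous bases (e.g. N) are reported as 'X'.
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(residues)


def three_frame_translations(dna: str) -> list:
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]
```

Each of the three translations can then be aligned against a protein database, and the frame giving the best alignment retained.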

Tables 3 and 4 (inserted before the claims) show the results of the
alignments. Table 3 contains alignment information for SEQ ID NOs:1-1847 and Table 4
contains alignment information for SEQ ID NOs:1848-3351. The DNA sequences of Table
4, while numbered SEQ ID 1-1504, correspond to SEQ ID NOs:1848-3351. Each DNA
sequence in Table 4 is uniquely identified by a number that is 1847 less than its SEQ ID
NO. Tables 3 and 4 refer to each sequence by its SEQ ID NO or a corresponding number,
the accession numbers and descriptions of nearest neighbors from the Genbank and Non-
Redundant Protein searches, and the p values of the search results.

For each of SEQ ID NOs:1-1847, the best alignment to a protein or DNA
sequence is included in Table 3, and the best alignment for each of SEQ ID NOs:1848-
3351 is included in Table 4. The activity of the polypeptide encoded by SEQ ID
NOs:1-3351 is the same or similar to the nearest neighbor reported in Table 3 or 4. The
accession number of the nearest neighbor is reported, providing a reference to the activities
exhibited by the nearest neighbor. The search program and database used for the alignment
also are indicated as well as a calculation of the p value.

Full length sequences or fragments of the polynucleotide sequences of
the nearest neighbors can be used as probes and primers to identify and isolate the full
length sequence of SEQ ID NOs:1-3351. The nearest neighbors can indicate a tissue or
cell type to be used to construct a library for the full-length sequences of SEQ ID
NOs:1-3351.



EXAMPLE 3 
MEMBERS OF PROTEIN FAMILIES 



The sequences (SEQ ID NOs:1-3351) were used to conduct a profile
search as described in the specification above. Several of the polynucleotides of the
invention were found to encode polypeptides having characteristics of a polypeptide
belonging to a known protein family (and thus represent new members of these
protein families) and/or comprising a known functional domain (Table 5). "Start" and
"stop" in Table 3 indicate the position within the individual sequences that align with
the query sequence having the indicated SEQ ID NO. The direction indicates the
orientation of the query sequence with respect to the individual sequence, where
forward (for) indicates that the alignment is in the same direction (left to right) as the
sequence provided in the Sequence Listing and reverse (rev) indicates that the
alignment is with a sequence complementary to the sequence provided in the Sequence
Listing.

Some polynucleotides exhibited multiple profile hits because, for
example, the particular sequence contains overlapping profile regions, and/or the
sequence contains two different functional domains. These profile hits are described in
more detail below.
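The forward (for) and reverse (rev) directions described above can be illustrated as follows. This sketch assumes that a "rev" alignment corresponds to the reverse complement of the sequence as written in the Sequence Listing; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, read left to right (5' to 3')."""
    return dna.translate(_COMPLEMENT)[::-1]


def strand_for_direction(dna: str, direction: str) -> str:
    """Return the strand that takes part in the alignment for 'for' or 'rev'."""
    if direction == "for":
        return dna
    if direction == "rev":
        return reverse_complement(dna)
    raise ValueError("direction must be 'for' or 'rev'")
```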
Ank Repeats (ANK). SEQ ID NOs:187, 1268, 1804, 1819, 1830, 1839,
2652, 3015 and 3267 represent polynucleotides encoding an Ank repeat-containing
protein. The ankyrin motif is a 33 amino acid sequence named for the protein ankyrin
which has 24 tandem 33-amino-acid motifs. Ank repeats were originally identified in
the cell-cycle-control protein cdc10 (Breeden et al., Nature (1987) 329:651). Proteins
containing ankyrin repeats include ankyrin, myotropin, I-kappaB, cell cycle
protein cdc10, the Notch receptor (Matsuno et al., Development (1997) 124(21):4265),
G9a (or BAT8) of the class III region of the major histocompatibility complex
(Biochem J. 290:811-818, 1993), FABP, GABP, 53BP2, Lin12, glp-1, SWI4, and
SWI6. The functions of the ankyrin repeats are compatible with a role in protein-protein
interactions (Bork, Proteins (1993) 17(4):363; Lambert and Bennet, Eur. J.
Biochem. (1993) 211:1; Kerr et al., Current Op. Cell Biol. (1992) 4:496; Bennet et al.,
J. Biol. Chem. (1980) 255:6424).

ATPases Associated with Various Cellular Activities (ATPases).
Sequences within SEQ ID NOs:431, 639, 2135, 2684, 2859, 3197 and 3266 correspond
to a sequence that encodes a novel member of the "ATPases Associated with diverse
cellular Activities" (AAA) protein family. The AAA protein family is composed of a
large number of ATPases that share a conserved region of about 220 amino acids that
contains an ATP-binding site (Froehlich et al., J. Cell Biol. (1991) 114:443; Erdmann et
al. Cell (1991) 64:499; Peters et al., EMBO J. (1990) 9:1757; Kunau et al., Biochimie
(1993) 75:209-224; Confalonieri et al., BioEssays (1995) 17:639;
http://yeamob.pci.chemie.uni-tuebingen.de/AAA/Description.html). The proteins that
belong to this family either contain one or two AAA domains. In general, the AAA
domains in these proteins act as ATP-dependent protein clamps (Confalonieri et al.
(1995) BioEssays 17:639). In addition to the ATP-binding 'A' and 'B' motifs, which are
located in the N-terminal half of this domain, there is a highly conserved region located
in the central part of the domain which was used in the development of the signature
pattern. The consensus pattern is: [LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-
[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R.
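Consensus patterns of this kind use a compact notation in which `x` is any residue, `[...]` lists the residues allowed at a position, `{...}` lists residues excluded at a position, and `(n)` or `(n,m)` gives a repeat count. A minimal sketch of how such a pattern could be converted to a regular expression and used to scan a protein sequence is shown below. The converter `pattern_to_regex` is a hypothetical helper that handles only the elements just listed, and the test peptide is synthetic; this is not the profile search software used to generate Table 5.

```python
import re

# One pattern element: x, a single residue, [allowed] or {excluded},
# optionally followed by a repeat count such as (4) or (5,7).
_ELEMENT = re.compile(
    r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?"
)


def pattern_to_regex(pattern: str) -> str:
    """Convert a dash-separated consensus pattern into a regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element.strip())
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            regex = "."                       # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # any residue except those listed
        else:
            regex = core                      # single residue or [allowed set]
        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)


AAA_PATTERN = (
    "[LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R"
)
AAA_REGEX = re.compile(pattern_to_regex(AAA_PATTERN))

# Synthetic peptide built to satisfy the pattern, embedded in flanking residues.
hit = AAA_REGEX.search("MK" + "LALLAGSNAAAALDAALAR" + "GG")
```

The same converter applies to the other consensus patterns given in this example, since they are written in the same notation.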

Bromodomain (bromodomain). SEQ ID NO:1814 represents a
polynucleotide encoding a polypeptide having a bromodomain region (Haynes et al.,
1992, Nucleic Acids Res. 20:2693-2603, Tamkun et al., 1992, Cell 68:561-572, and
Tamkun, 1995, Curr. Opin. Genet. Dev. 5:473-477), which is a conserved region of
about 70 amino acids. The bromodomain is thought to be involved in protein-protein
interactions and may be important for the assembly or activity of multicomponent
complexes involved in transcriptional activation. The consensus pattern, which spans a
major part of the bromodomain, is: [STANVF]-x(2)-F-x(4)-[DNS]-x(5,7)-[DENQTF]-
Y-[HFY]-x(2)-[LIVMFY]-x(3)-[LIVM]-x(4)-[LIVM]-x(6,8)-Y-x(12,13)-[LIVM]-x(2)-
N-[SACF]-x(2)-[FY].

Basic Region Plus Leucine Zipper Transcription Factors (BZIP). SEQ
ID NOs:410, 552, 768, 822, 836, 1288, 1365, 1454, 1540, 1549, 1556, 1557, 1563,
1622, 1630, 1704, 1808, 2363, 2424, 3147, 3152, 3158 and 3208 represent
polynucleotides encoding a novel member of the family of basic region plus leucine
zipper transcription factors. The bZIP superfamily (Hurst, Protein Prof. (1995) 2:105;
and Ellenberger, Curr. Opin. Struct. Biol. (1994) 4:12) of eukaryotic DNA-binding
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i r ^^pN^{^WyW-^NSTCl-[DNQGHRK]-{OP}-lLlVMq- 

IDENQSTAOC £SSSrs EQID NO:,Sn ^sen.sapoiynocieofide 

encoding a po.^dT^T^ana, homoiogy in ETS domain r-otems o tos 
encoamg a puiy^F "PTS-domain " that is involved in DNA 

0 family contain a conserved domain, the ETS-domain, 

7 27— (.993) «,:7.» The * gene family encodes a novel Cass of 
DNA-btding pm,ei„s, each of which binds a specific DNA sconce and compos an 
!5 domain £ specific* -erncis wi,h .he — « £ 

^ r-r;A in addition to an ets domain, native en piuitu r 

gro l, differentiation and development, and three members are imphcated 
30 oncogenic process. represents a 

polynocleodde encod,nga novo, ^^^J^T^ 

35 Ii:eUn,: lcZ, such as ion channels and enzymes « «, ,e »f 
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, i- „i. ;n th^ resting state, associate as a mmer at me uuisi 

gamma) which, in the resting state, G -protein alpha 

membrane. THe alpha snbnnit binds OTP and exhibits GTPase activity^ p P 
snbnnits are 350-400 amino acids to length and have molecular weights in the range 40 
5 45 iventeen distinct types of alpha snbont, have been identified m — 1 , 

1 fall into 4 main gronps on the basis „ f bod, seance sim = fime onatoha 

te imporJ. tor membrane association and high- affinity interactions 

NOs-1496 2826 and 2871 represent polynnelentides encoding novel members of the 

DEAD/H helicase family. A nomber of enkaryotic and proXaryotic proteins have b«n 
DEAD/H hehe^ tarn y ^ ^ p > a ^ 

" "Zl - - '-,ved in ATP-dependent, nneleic^c, n— , 

A „ DEAD bo* family members of the above proteins sh^re an m* f «« 
sequence motifs, some of which are specific to the DEAD family 
shied by other ATP-btodtog proteins or bv prctems belonging » 
20 •snnerfamiiy (Hodgman T.C., Mm.™ (1988) Mi:22 and Mure (1988) W.578 
20 supertamuy in b n.A-D-box", represents a special version of 

rprratat One of these motifs, called the D-fc-A-u dok. , i^f r 

nl^ATP. b in di n g proteins. Some other proteins 

have His instead of the second Asp and are thus said to be "D-E-A-H-box protems 
have His insiea 349.463. Harosh I., et al., Nucleic Acids Res. 

(Wassarman D.A., et al., Nature ^ l \ 34 ' A6 \ „. 9g9> ^ follo wing 

25 (1991) 79:6331; Koonin E.V. et al., J. Gen. Virol. (1992) m 

ignature patterns are used to identify members of both subf^h es: ° 
E A-D-[RKEN]-x-[LIVMFYGSTN]; and 2) [GSAHJ-x-tLIVMPlO^D-E-tALlV] H 

tNECR] - H^to*mi^^ SEQ ID NOs:l676, 1820 and 1821 

30 represent polynucleotides encoding proteins having a homeobox domam. The 
homeobox is a . protein domain of 60 amino ac.ds (Gehring In: 

^ L 1 tw ca r\r\ i 10 Oxford Un versity Press, Oxford, (1 
Hnmeobox Genes, DubouleD., Ed., pp. UXIoruuiU 17 . 

^^^U*^^ pp25-72, Oxford University Press, 

Buergnn in. yu - . 17 277-280; Gehring et al., 

Oxford, (1994); Gehring, Trends Bwchem. Sci. (1992) 7.2// • » 
Annu. Rev. Genet. (1986) 20:147-173; Schofield, Trends Neurosci. (1987) 10:3-6) first
identified in a number of Drosophila homeotic and segmentation proteins. It is
extremely well conserved in many other animals, including vertebrates. This domain
binds DNA through a helix-turn-helix type of structure. Several proteins that contain a
homeobox domain play an important role in development. Most of these proteins are
sequence-specific DNA-binding transcription factors. The homeobox domain is also
very similar to a region of the yeast mating type proteins. These are sequence-specific
DNA-binding proteins that act as master switches in yeast differentiation by controlling
gene expression in a cell type-specific fashion.

A schematic representation of the homeobox domain is shown below.
The helix-turn-helix region is shown by the symbols 'H' (for helix), and 't' (for turn).

xxxxxx«xxxxxxxxxxx«xxxxx«xxxHHHHHHHHtttH«HHHHH«HXXXXX«xxx 

1 

The pattern detects homeobox sequences 24 residues long and spans
positions 34 to 57 of the homeobox domain. The consensus pattern is as follows:

iUVFSTNK.WPYVC:-* , 3 , ?5 ^ 

MAP kinase kinase (mkk). SEQ 1L) INUS.zy, ji, . oc 

3281 represent novel members of the MAP kinase kinase family. MAP kinase 
ZpK) are involved in signal transduction, and are important in cell cycle and ceU 
(MAP, ) t . - MAP kinaS e kinases (MAPKK) are dual-specificity protem 

zi ;:r r— ^ ^ 

been found in yeast, invertebrates, amphibians, and mammals. Moreover, , he 
i mIp^MAPK phosphorylation switch constitutes a basic moduie acvated ,n dtstmc, 
, MArwuMftr v u , x/APKICi are essential transducers through which 

pathways in yeast and in vertebrates. MAPKKs are essential Rioloeique Biol 

signals must pass before reaching the nucleus. For review, see , g . B * 
Cell (1993) 79:193-207; INishida et al., Trends Biockem Sa (1993) 75.128 3 , 

^ n roll Kiol H993) 5207-13; Dhanasekaran et al., Oncogene (1998) 
Ruderman, Curr Opm Cell B ol ^(1993) 5.207 &// 

0 7 7:1447-55; Kiefer et al., Biochem Soc Trans (I wi) > 

(1996)5:533-44. SEO ID NOsl 157, 1478, 1496, 2286, 2969 

Prntein Kinas e (protkinase). SEQ iL> Mus. , 
and 3,90 represent polynucleotides encoding protein kinase, Protein kinases cataly* 
CU-n of protetns in a vanety of padtways, 

Eukaryolic protein kinases (Hanks S.K,e, a,., f^SEB J. (1995)9.5/0 nu , 
^1. (199,) MM; Hanks S.K., e, a,., Mem. (1991) 28*3* Hanks S.K., 



35 

Enzymol. (1991) 200:3; Hanks S.K., et al.,
Curr. Opin. Struct. Biol. (1991) 1:369; Hanks S.K. et al., Science (1988) 241:42) are
enzymes that belong to a very extensive family of proteins which share a conserved
catalytic core common to both serine/threonine and tyrosine protein kinases. There are
a number of conserved regions in the catalytic domain of protein kinases. The first
region, which is located in the N-terminal extremity of the catalytic domain, is a
glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown
to be involved in ATP binding. The second region, which is located in the central part
of the catalytic domain, contains a conserved aspartic acid residue which is important
for the catalytic activity of the enzyme (Knighton D.R. et al., Science (1991) 253:407).
The protein kinase profile includes two signature patterns for this second region: one
specific for serine/threonine kinases and the other for tyrosine kinases. A third profile
is based on the alignment in (Hanks S.K. et al., FASEB J. (1995) 9:576)

MiM °T~s patterns are as follows: 1) ^1-)- 

rUVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K, where k 
UVWYCl-x-^x-D-^ 

site residue; and 3) [LlVMFYC]-x-[HY]-x-D-[LlVMFY]-[RSTAC) x(2) N 
rr 1VMFYC1 where D is an active site residue. 

If a protein analyzed includes two of the above protein kinase signatures,
the probability of it being a protein kinase is close to 100%.

Ras family proteins (ras). SEQ ID NOs:1688 and 3258 represent
polynucleotides encoding members of the ras family of small GTP/GDP-binding proteins
(Valencia et al., 1991, Biochemistry 30:4637-4648). Ras family members
generally require a specific guanine nucleotide exchange factor (GEF) and a specific
GTPase activating protein (GAP) as stimulators of overall GTPase activity. Among
ras-related proteins, the highest degree of sequence conservation is found in four
regions that are directly involved in guanine nucleotide binding. The first two
constitute most of the phosphate and Mg2+ binding site (PM site) and are located in the
first half of the G-domain. The other two regions are involved in guanosine binding and
are located in the C-terminal half of the molecule. Motifs and conserved structural
features of the ras-related proteins are described in Valencia et al., 1991, Biochemistry
30:4637-4648. A major consensus pattern of ras proteins is: D-T-A-G-Q-E-K-[LF]-G-
G-L-R-[DE]-G-Y-Y.
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Thi^dn^^ SEQ ID NO:1677 represents 

LnXed amino- acid residues which partite in various redox reachons v.a .he 

n — - r - r — - - - 

— W ° ^ ^X: iL .vM F vc,. [ P vwst„ E >x( 2 Hfvwo™k- 

[GATPLVEHPHYWSTA]-C-x(6)-[L[VMFYWTJ (where the .wo Cs form ft. redox- 

15 aC " Veb0nd) - T QE ri IL i M! «. SEQ ID NO:,4,0 corresponds .o a novel serine 
Dr „,ease of — aCiviby of .he serine proteases from . he 

^ r^i,, is P-ided hv a charge relay s y s.em involving - asparhc «d»£ 

ponded .„ a "JZZ. - 

sequences m the vicinity of the active sue , 54-528} The 

» conserved in mis famiiy of proves ^ 

consensus pahems for .his ,ryps,n ^^^S^HW»<MPE1- 
where H is the active site residue; and 2) [DNS 1 AUM IV^ l * j 
r™SAPHV]- [LIVMFYWH]-[L1VMFYSTANQH], where S ,s .he achve s„e 

STi: sejnces .own » ^ * I- 

signatures, UK probability of i. being a trypsin family serine protease ts 100/. 
g VrU ,in n.-.-p.lW)^). SEQ lDNOs.1336 , 1380 

,7,1 1762 1909, 2218, 3047, 3108 and 3292 represen. novel members of the WD 
l/ii, i/oz, '™ 7 > . of the three subunits 

30 domain/G-beta repeat family. Beta-transducin (G-beta) 1S one ot tn 

(alpha beta and gamma) of the guanine nucleotide-binding proteins (G proteins) which 
c t as' ^'rmediaries in the transduction of signals generated by — bran 
receptors (G^man, Annu. Re, BtocHem. (1987) 56:615). The a. P ha subunit in^o 
Z ^ hydro Us GTP; the functions of the beta and gamma subunits are less clear but 
and hyaroiyzeb , f membrane 

35 they seem to be required for the replacement of GDP by U 1 P as w 






M and receptor rec„ g ni,io, ,n hi g her "^m^i « 
m „m S e„e fami.y of hW* " f LT4O residoes, each 

Struck, G-be B consists of etght '^ ^.^^ calted . WD-40 

«— .^^^rr^TlUH.-. repeat fami* is, 

■ Wnt 1 fnrevious y known as int-1), tne senium 
signaling proteins. Wnt-1 ^(prev c y ^ & ^ ^ intercellular 

family, (Nusse R„ I*** ( V" a molecule important in the development of 

communication and seems to be a signalling mo ecu* mpo ^ 

the centra, nervous system (CNS). Ml N-glycosylation 
characteristics of secretory proteins: a signal pepude se p ^ ^ 

- - d 22 co rr a r r ^ — of - - - are 

based upon a highly conserved regton .nclndmg three cyste, 

^MUVmH^C^^^^^^^ SEQ ID NO,4,7 
5 EE!!e!lJi ?^"l^ Tyrosine specrfic 

represent a Hynncieohde -^^L e, a,., ««- (•».) »*»•: 
protein phosphatases (EC ^ 3 «> P ' 8;463; abridge, J. «W. CHcm. 

Charbonneau e. .1., <«■ <>« f »' (19 *\ :497 . ^ Hunter, OH 

„„„ 2dd:235,7 ; Torn, e, a,., TV- ^" a f ^ a «ached ,0 a tyrosrhe 
S („ 8 9) 5 S :,».3) cataiyze the ^J***^^ 8rowt b, proration, 
residne. These enzymes are very tmponan have ^ chaia «erized 

differentiation and transformation. MuitrpU „ brane reeeptor 

- ZZZSZ — " 1 — -~ are 

proteins that contain PTPase domain^ transme mbrane region 

,„ Lde np of a varrabie ,ength e—ar domam £ owed by ° ^ ^ ^ 

and a C-termrnal catalytie cytoplasmte domam FIW >dom 

amioo aeids. The seareb of two ^^ ^r^L in ins immediate 
reared for activty. Fnrthermore, a number o, — ^ S fm m ^ is: 
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. ^^^ ?H , Typf (7 , inc f,„ 8 am- SEQ ID NOs.308, 807 
1324, 1503, 1527, 3081, 3193 and 3306 correspond to polynucleotides encoding novel
members of the C2H2 type zinc finger protein family. Zinc finger domains (Klug
et al., Trends Biochem. Sci. (1987) 12:464; Evans et al., Cell (1988) 52:1; Payre et al.,
FEBS Lett. (1988) 234:245; Miller et al., EMBO J. (1985) 4:1609; and Berg, Proc. Natl.
Acad. Sci. USA (1988) 85:99) are nucleic acid-binding protein structures. In addition to
the conserved zinc ligand residues, it has been shown that a number of other positions
are also important for the structural integrity of the C2H2 zinc fingers (Rosenfeld et
al., J. Biomol. Struct. Dyn. (1993) 11:557). The best conserved position is found four
residues after the second cysteine; it is generally an aromatic or aliphatic residue. The
consensus pattern for C2H2 zinc fingers is: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-
x(3,5)-H. The two C's and two H's are zinc ligands.

X( ' ' S_ ! zJ* m ^. SEQ ID NOs:,86, 2591, 3307 and 3339 represen 

polynucleotides encoding novel members of tbe family of Src homology 2 SH2 
• ,-ins The Src homology 2 (SH2) domain is a protein domain of about 100 amino 
' S ^f,i as a conserved seouence region between the oncoproteins 

Z Z Fps (Sadowski 1. e, a,., Mo,. «oL d-4396-4408 «™^ r ^ 

containing target peptides in a scquc f~ p awS on 

aetices and six to seven beta-strands. The core of die domain is formed by a, =on,,„uous 
beta-meander composed of two connected beta-sheets (Kenyan J., Cowbum , D., Curr. 
£ ZL. Bio,. 3:828-837,1993),. The profile to detect SH2 domains is ^ed on 
sltura. alignment consisting of 8 gap-free blocks and 7 linker regions totaling 92 

30 ^^"L^gy-i SEQ .O NQ:234, ,832, and ,835 represent 
polynucleotides encoding novel members of the family of Src homology 3 (SH3)
proteins. The Src homology 3 (SH3) domain is a small protein domain of about 60
amino acid residues first identified as a conserved sequence in the non-catalytic part of
several cytoplasmic protein tyrosine kinases (e.g., Src, Abl, Lck) (Mayer B.J. et al.,
Nature 332:272-275 (1988)). Since then, it has been found in a great variety of other
intracellular or membrane-associated proteins (Musacchio A. et al., FEBS Lett. 307:55-
61 (1992); Pawson T., Schlessinger J., Curr. Biol. 3:434-442 (1993); Mayer B.J.,
Baltimore D., Trends Cell Biol. 3:8-13 (1993); Pawson T., Nature 373:573-580 (1995)).

The SH3 domain has a characteristic fold which consists of five or six
beta strands arranged as two tightly packed anti-parallel beta sheets. The linker regions
may contain short helices (Kuriyan J., Cowburn D., Curr. Opin. Struct. Biol. 3:828-837
(1993)).

The function of the SH3 domain may be to mediate assembly of specific
protein complexes via binding to proline-rich peptides (Morton C.J., Campbell I.D.,
Curr. Biol. 4:615-617 (1994)).

In general SH3 domains are found as single copies in a given protein, but
there are a significant number of proteins with two SH3 domains and a few with 3 or 4
copies.
15 COPiCS ' m^^_UL SEQ ID NOs:746 and 1192 represent 

polynucleotides encoding novel members of the family of fibronectin type III proteins.
A number of receptors for lymphokines, hematopoietic growth factors and growth
hormone-related molecules have been found to share a common binding domain
(Bazan J.F., Biochem. Biophys. Res. Commun. 164:788-795 (1989); Bazan J.F., Proc.
Natl. Acad. Sci. U.S.A. 87:6934-6938 (1990); Cosman D. et al., Trends Biochem. Sci.
15:265-270 (1990); d'Andrea A.D., Fasman G.D., Lodish H.F., Cell 58:1023-1024
(1989); d'Andrea A.D., Fasman G.D., Lodish H.F., Curr. Opin. Cell Biol. 2:648-651
(1990)).

The conserved region constitutes all or part of the extracellular ligand-
binding region and is about 200 amino acid residues long. In the N-terminal of this
domain there are two pairs of cysteines known, in the growth hormone receptor, to be
involved in disulfide bonds.



Two patterns detect this family of receptors. The first one is derived
from the first N-terminal disulfide loop, the second is a tryptophan-rich pattern located
at the C-terminal extremity of the extracellular region.

A consensus for this protein family is: C-[LVFYR]-x(7,8)-[STIVDN]-C-
x-W (the two C's are linked by a disulfide bond). A second consensus for this protein
family is: [illegible].

^ lST !S^i— - SEQ ID XGs:,2o9, -9, ,3,0, and 
,386 represent po,ynuc,eo,id.s encoding nove, members of.be family of UMta- 
olining pro,eins. A number of pro.eins con.ain a conserved cysteme-nch dnma,n of 
r 7am no-acid residues. (Freyd G. e. a,, N al ure 344:876-879 (,990); Ba,tz R. e 
t plan, a,l 4:,465-1466 (,992); Sanchez-Garcia ,., Rabbins T.H., Trends Gene,. 

„ ' 0:3W20 (1 T„1 UM domain, .bere are seven conserved cy,ei„e rcsidu es and a 
hislidinc The arrangement followed by these conserved residues is C-x(2)-C- x(16,23)- 
H ^ CH]-x(2)-C xf^^^^C-^WHCHD]. The L,M domain b.nds two zmc 
"ns Ssen .W. e. »,., Froc. Acad. Sc, V.S.A. 9*4404-4408 (,993),. L M 
L.L bind OHA, ramcr it seems . ac, as interface * 

5 The consensus for this protein family is. C-x(2)-C x(n,zi) L r 

^-G^-G^,-™. m NO , 1325 ^ 2282 

r? Homain ( protein kinase C like), bt.^ ^ 
represent polynucleotides encoding novel members of the family of C2 domain
containing proteins. Some isozymes of protein kinase C (PKC) contain a domain,
known as C2, of about 116 amino-acid residues, which is located between the two
copies of the C1 domain (that bind phorbol esters and diacylglycerol) and the protein
kinase catalytic domain (Azzi A. et al., Eur. J. Biochem. 208:547-557 (1992); Stabel S.,
Semin. Cancer Biol. 5:277-284 (1994)).

The C2 domain is involved in calcium-dependent phospholipid binding
(Davletov B.A., Suedhof T.C., J. Biol. Chem. 268:26386-26390 (1993)). Since
domains related to the C2 domain are also found in proteins that do not bind calcium,
other putative functions for the C2 domain include binding to inositol-1,3,4,5-
tetraphosphate (Fukuda M. et al., J. Biol. Chem. 269:29206-29211 (1994)).

The consensus pattern for the C2 domain is located in a conserved part
of that domain, the connecting loop between beta strands 2 and 3. The profile for the C2
domain covers the total domain. The consensus for this protein family is: [ACG]-x(2)-
L-x(2,3)-D-x(1,2)-[NGSTLIF]-[GTMR]-x-[STAP]-D-[PA]-[FY].

Serine Proteases, Trypsin Family. SEQ ID NO:1410
represents a polynucleotide encoding a novel member of the family of serine protease
trypsin proteins. The catalytic activity of the serine proteases from the trypsin family is
provided by a charge relay system involving an aspartic acid residue hydrogen-bonded
to a histidine, which itself is hydrogen-bonded to a serine. The sequences in the vicinity
of the active site serine and histidine residues are well conserved in this family of
proteases (Brenner S., Nature 334:528-530 (1988)).

A consensus for this protein family is: [LIVM]-[ST]-A-[STAG]-H-C [H
is the active site residue]. A second consensus for this protein family is: [DNSTAGC]-
[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-
[LIVMFYSTANQH] [S is the active site residue].
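The consensus notation used in these family descriptions can be read mechanically. The following is a minimal sketch, not part of the disclosed method, assuming the usual reading of the notation: `x` is any residue, `[..]` lists allowed residues, `{..}` lists excluded residues, and `(n)` or `(n,m)` is a repeat count. The function name `prosite_to_regex` and the test peptide are hypothetical and chosen only for illustration.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex."""
    pieces = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate the residue specification from an optional repeat
        # such as (2) or (7,8).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element.strip())
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            atom = "."                        # x: any residue
        elif core[0] == "{" and core[-1] == "}":
            atom = "[^" + core[1:-1] + "]"    # {..}: any residue except these
        else:
            atom = core                       # single residue or [..] set
        if low is not None:
            atom += "{" + low + ("," + high if high else "") + "}"
        pieces.append(atom)
    return "".join(pieces)


# The two trypsin-family consensus patterns quoted in the text.
HIS_SITE = prosite_to_regex("[LIVM]-[ST]-A-[STAG]-H-C")
SER_SITE = prosite_to_regex(
    "[DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-"
    "[LIVMFYWH]-[LIVMFYSTANQH]"
)

# Hypothetical peptide, invented for illustration only.
peptide = "WVLTAAHCY"
hit = re.search(HIS_SITE, peptide)
print(HIS_SITE, hit.span())
```

A match of this kind only indicates that a translated sequence contains the consensus; it is not by itself evidence of family membership or activity.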

RNA Recognition Motif Domain (RRM, RBD, or RNP). SEQ ID NOs:
1464 and 1514 represent polynucleotides encoding novel members of the family of
RNA recognition motif domain proteins (Bandziulis R.J. et al., Genes Dev. 3:431-437
(1989); Dreyfuss G. et al., Trends Biochem. Sci. 13:86-91 (1988)).

Inside the putative RNA-binding domain there are two regions which are
highly conserved. The first one is a hydrophobic segment of six residues (which is
called the RNP-2 motif); the second one is an octapeptide motif (which is called RNP-1
or RNP-CS). The position of both motifs in the domain is shown in the following
schematic representation:

xxxxxxx######xxxxxxxxxxxxxxxxxxxxxxxxxxxxx########
       RNP-2                              RNP-1

As a consensus pattern for this type of domain the RNP-1 motif was
used. The consensus for this protein family is: [RK]-G-{EDRKHPCG}-[AGSCI]-
[FY]-[LIVA]-x-[FYLM].
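As an illustration of how the RNP-1 octapeptide consensus, including its excluded-residue position `{EDRKHPCG}`, could be applied to a translated sequence, the sketch below writes the pattern directly as a regular expression. The helper name `find_rnp1` and both test fragments are hypothetical; they are not taken from the sequence listing.

```python
import re

# RNP-1 consensus from the text, written directly as a regular expression:
# [RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYLM]
RNP1 = r"[RK]G[^EDRKHPCG][AGSCI][FY][LIVA].[FYLM]"


def find_rnp1(sequence):
    """Return (0-based start, octapeptide) for every RNP-1 match.

    A lookahead is used so that overlapping matches are also reported.
    """
    scanner = re.compile("(?=(" + RNP1 + "))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(sequence.upper())]


# Hypothetical fragments, invented for illustration only.
print(find_rnp1("MSKGFGFVTFDD"))  # third position is F, which is allowed
print(find_rnp1("MSKGEGFVTFDD"))  # third position is E, which is excluded
```

The second fragment differs from the first only at the position governed by the exclusion set, which is enough to remove the match.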

Phosphatidylinositol-specific Phospholipase C, Y Domain. SEQ ID NO:
1707 represents a polynucleotide encoding a novel member of the phosphatidylinositol-
specific phospholipase C, Y domain family of proteins. Phosphatidylinositol-specific
phospholipase C (EC 3.1.4.11), a eukaryotic intracellular enzyme, plays an important
role in signal transduction processes (Meldrum E. et al., Biochim. Biophys. Acta
1092:49-71 (1991)). It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-
3,4,5-triphosphate into the second messenger molecules diacylglycerol and inositol-
1,4,5-triphosphate. This catalytic process is tightly regulated by reversible
phosphorylation and binding of regulatory proteins (Rhee S.G., Choi K.D., Adv. Second
Messenger Phosphoprotein Res. 26:35-61 (1992); Rhee S.G., Choi K.D., J. Biol. Chem.
267:12393-12396 (1992); Sternweis P.C., Smrcka A.V., Trends Biochem. Sci. 17:502-
506 (1992)).

All eukaryotic PI-PLCs contain two regions of homology, referred to as
"X-box" and "Y-box". The order of these two regions is the same (NH2-X-Y-COOH),
but the spacing is variable. In most isoforms, the distance between these two regions is
only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains,
and one SH3 domain are inserted between the two PLC-specific domains. The two
conserved regions have been shown to be important for the catalytic activity. At the C-
terminal of the Y-box, there is a C2 domain possibly involved in Ca-dependent

membrane attachment.

Serine Carboxypeptidases. SEQ 1L> NU.im y represents a
polynucleotide encoding a novel member of the serine carboxypeptidases family of
proteins. Carboxypeptidases may be either metallo carboxypeptidases or serine
carboxypeptidases. The catalytic activity of the serine carboxypeptidases is provided
by a charge relay system involving an aspartic acid residue hydrogen-bonded to a
histidine, which is itself hydrogen-bonded to a serine (Liao D.I., Remington S.J., J. Biol.
Chem. 265:6528-6531 (1990)).

The sequences surrounding the active site serine and histidine residues
are highly conserved in all these serine carboxypeptidases. A consensus for this protein
family is: [LIVM]-x-[GTA]-E-S-Y-[AG]-[GS] [S is the active site residue]. A second
consensus for this protein family is: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]-
[SAGV]-[SG]-H-x-[IVAQ]-P-x(3)-[PSA] [H is the active site residue].

dsrm Double-stranded RNA Binding Motif. SEQ ,D no15 '°
represents a polynucleotide encoding a novel member of the dsrm double-stranded
RNA binding motif proteins. In eukaryotic cells, a multitude of RNA-binding proteins
play key roles in the posttranscriptional regulation of gene expression. Characterization
of these proteins has led to the identification of several RNA-binding motifs. Several
human and other vertebrate genetic disorders are caused by aberrant expression of
RNA-binding proteins. (C. G. Burd & G. Dreyfuss, Science 265:

Proteins containing double stranded RNA binding motifs bind to specific
RNA targets. Double stranded RNA binding motifs are exemplified by interferon-
induced protein kinase in humans, which is part of the cellular response to dsRNA.

SEQ ID NOs:2577, 3183 and 3195 encode members of the 4 transmembrane
integral membrane protein family. This family consists of type III proteins,
which are integral membrane proteins that contain an N-terminal membrane-anchoring
domain that is not cleaved during biosynthesis, and which functions as a translocation
signal and a membrane anchor. The proteins also have three additional transmembrane
regions. The consensus pattern is: G-x(3)-[LIVMF]-x(2)-[GSA]-[LIVMF](2)-G-C-x-
[GA]-[STA]-x(2)-[EG]-x(2)-[CWN]-[LIVM](2).

SEQ ID NO:2944 encodes a polypeptide having a calpain large subunit,
domain III. Calpains are a family of intracellular proteases that play a variety of

nrus! and plays a role in limb-girdle muscular dystrophy type 2A. (Sorimachi, H. et
al., Biochem. » ^ ^ encode polypeptides having a C3HC4 type

zinc finger domain (RING finger), which is a cysteine-rich domain of 40 to 60 residues

interactions. Mammalian proteins of this family include V(D)J recombination
activating protein, which activates the rearrangement of immunoglobulin and T-cell
receptor genes; breast cancer type 1 susceptibility protein (BRCA1); bmi-1 proto-
oncogene; cbl proto-oncogene; and mel-18 protein, which is expressed in a variety of
tumor cells and is a transcriptional repressor that recognizes and binds a specific DNA
sequence. The consensus pattern is: C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA].

sequence. ^ & factor with a fork
head domain of about 100 amino acid residues. Proteins of this group are transcription
factors, including mammalian transcription factors HNF-3-alpha, -beta, and -gamma,
muuu. & , UTT R which binds to a region of human T-

S do™ signaling proteins belong to this group of proteins that contain
repeats known as PDZ domains. Several of the proteins interact with the C-terminal
tetrapeptide motifs X-Ser/Thr-X-Val-COO- of ion channels and/or receptors (Ponting,

CP - Pro,ein ^T N ai 9 5. 7 'eneodes a ^peptide in the family of pnorbo, 
esters/glycerol binding proteins. Phorbol esters (PE) are analogues of diacylglycerol
(DAG) and potent tumor promoters. DAG activates a family of serine-threonine protein
kinases known as protein kinase C. The N-terminal region of protein kinase C binds
PE and DAG, and contains one or two copies of a cysteine-rich domain of about 50
amino acid residues. Other proteins having this domain include
the vav oncogene; and N-chimaerin, a brain-specific protein. The DAG/PE binding
domain binds two zinc ions through the six cysteines and two histidines that are
conserved in the domain. The consensus pattern is: H-x-[LIVMFYW]-x(8,11)-C-x(2)-
C-x(3)-[LIVMFC]-x(5,10)-C-x(2)-C-x(4)-[HD]-x(2)-C-x(5,9)-C.
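The statements that this domain is "about 50" residues long and carries six conserved cysteines can be checked against the consensus itself. The sketch below is an illustrative calculation, assuming the DAG/PE pattern reads as transcribed in the code comment; the function name `pattern_span` is hypothetical.

```python
import re

# DAG/PE-binding consensus as read from the text (assumed transcription).
DAG_PE = ("H-x-[LIVMFYW]-x(8,11)-C-x(2)-C-x(3)-[LIVMFC]-x(5,10)-"
          "C-x(2)-C-x(4)-[HD]-x(2)-C-x(5,9)-C")


def pattern_span(pattern):
    """Return (shortest, longest) number of residues a pattern can match."""
    shortest = longest = 0
    for element in pattern.split("-"):
        # A trailing (n) or (n,m) gives the repeat count; otherwise one residue.
        m = re.search(r"\((\d+)(?:,(\d+))?\)$", element)
        if m:
            low = int(m.group(1))
            high = int(m.group(2)) if m.group(2) else low
        else:
            low = high = 1
        shortest += low
        longest += high
    return shortest, longest


span = pattern_span(DAG_PE)
fixed_cysteines = DAG_PE.split("-").count("C")  # positions that must be Cys
print(span, fixed_cysteines)
```

Under these assumptions the consensus spans 42 to 54 residues with six fixed cysteine positions, which agrees with the description of the domain given above.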

SEQ ID NO:2216 encodes a polypeptide having a WW/rsp5/WWP
domain. The protein is named for the presence of conserved aromatic positions,
generally tryptophan, as well as a conserved proline. Proteins having the domain
include dystrophin, vertebrate YAP protein, and IQGAP, a human
protein which acts on ras. The consensus pattern is: W-x(9,11)-[VFY]-[FYW]-x(6,7)-
[GSTNE]-[GSTQCR]-[FYW]-x(2)-P.

SEQ ID NO:2428 encodes a member of the dual specificity phosphatase
family, having a catalytic domain, and SEQ ID NOs:2281 and 2310 encode members
of the protein tyrosine phosphatase family. These families are related and classed as
tyrosine specific protein phosphatases. The enzymes catalyze the removal of a
phosphate group from a tyrosine residue, and are important in the control of cell growth,
proliferation, differentiation, and transformation. The consensus pattern is: [LIVMF]-H-
C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY].

Table 1



SEQ ID  CLUSTER  SEQ NAME                        ORIENTATION  CLONE ID            LIBRARY
1       377044   RTA00002676F.p.11.2.P.Seq       F            M00039329A:C01      CH09LNL
2       377708   3.TA00002O3 j r.m.u I .-.r.seq  F            M00040039A:G08      CH09LNL
3       427782   RTA00002666F.I.06.1.P.Seq       F            M00032633D:A06      CH08LNH
4       29372    RTA00002712F.a.06.1.P.Seq       F            M000"3"'S _ ' VC02  CH04MAL
5       455003   RTA00002694F.b.0i.1.P.Seq       F            M00043419D:A10      CH20COHLV
6       380625   RTA00002634F.d.03.2.P.Seq       F            M000401 ISD:G10     CH09LNL
7       450959   RTA00002691F.b.03.3.P.Seq       F            M00043306D:B07      CH17COHLV
8       397351   RTA00002680F.b.04.1.P.Seq       F            M00039775A:A09      CH09LNL
9       20652    RTA00002710F.k.01.1.P.Seq       F            M00022440B:E01      CH03MAH


10 

r. 


97S30 
373071 


RTA00002663F.k. 13. ! .P.Seq 
RTA0O0O26"0F.j.23. 1. PSeq 


F 
F 


M00022767B:G 1 1 
M00033442A:D06 


CH03MAH 
CH09LNL 


12 
13 


162369 
401247 


RTA000027 1 3F.e.0 1 . 1 .P.Seq 


F 
F 


M00027292D:F10 
M00039508A:CI2 


CH04MAL 
CH12EDT 


14 


430738 


RTA00002669F.1. 1 D.3-.P.Seq 


F 


M00033231D:B09 


CHOSLNH 


15 


46779 


RTA0000271 1 F.c. 14. 1 .P.Seq 


F 
F 


M00022860C:G04 
M0003990^C:G05 


CH03MAH 
CH09LNL 


16 

17 


375772 
4306S9 


RTA0000268 1 F.p.0 1 .2. P. Seq 
RTA00002669F.J.0 1 .3. P.Seq 


F 


M00O33243B:A05 


CHOSLNH 


SEQ ID  CLUSTER  SEQ NAME                        ORIENTATION  CLONE ID               LIBRARY
18      376546   RTA00002677F.d.07.2.P.Seq       F            M00039345C:C12         CH09LNL
19      430041   RTA00002667F.t'.17.1.P.Seq      F            M00032790B:A07         CH08LNH
20      431643   RTA00002669F.1.16.1.P.Seq       F            M00033276D:H09         CH08LNH
21      19422    RTA00002709F.C.02.1.P.Seq       F            M00005449B:B10         CH02COH
22      376802   RTA00002677F.C.18.2.P.Seq       F            M00039344B:G07         CH09LNL
23      376814   RTA000026"4F.h.02.1.P.Seq       F            M00039139C:G12         CH09LNL
24      375492   RTA000026"7F.m.19.2.P.Seq       F            M000394 1S3:D08        CH09LNL
25      379114   RTA00002681F.n.24.2.P.Seq       F            M00039903C:F03         CH09LNL
26      380663   RTA00002670F.p.11.1.P.Seq       F            M00033581C:H10         CH09LNL
27      213817   RTA00002664F.'..19.2.P.Seq      F            M00027634A:Ol 1        CH04MAL
28      375745   RTA00002680F.t.23.1.P.Seq       F            M0C039 _ 95D.G06       CH09LNL
29      430396   RTA00002669F.b.20.4.P.Seq       F            M00033185C:D01         CH08LNH
30      380462   RTA000026"0F.o.01.1.P.Seq       F            M000335 "0B:E06        CH09LNL
31      430396   RTA00002669F.b.20.3.P.Seq       F            M00033185C:D01         CH08LNH
32      376996   RTA00002676F.p.13.2.P.Seq       F            M00039329C:B10         CH09LNL
33      374846   RTA00002b77F.k.19.2.P.Seq       F            M00039412D:G06         CH09LNL
34      379075   RTA00002672F.n.13.2.P.Seq       F            M0003905 C >8:E03      CH09LNL
35      374172   RTA00002673F.k.16.2.P.Seq       F            M00039097D:D06         CH09LNL
36      373104   RTA00002683F.0.1^.2.P.Seq       F            M00040008D:G12         CH09LNL
37      186302   RTA00002"15F.m.21.1.P.Seq       F            M000275-3 ! B.C04      CH04MAL
38      427947   RTA00002665F.o.01.1.P.Seq       F            M000324OfB:D02         CH08LNH
39      375130   RTA00002673F.J.17.i.P.Seq       F            M000390c4D:H09         CH09LNL
40      377534   RTA00002633F.1.22.2.P.Seq       F            M 000400 SSC.E 10      CH09LNL
41      377364   RTA000026T3F.a.1^.2.P.Seq       F            M0C03O43 2C-A0!        CH09LNL
42      37634"   RTA00002675F.l.08.1.P.Seq       F            M0003 c >:-i o C:Gl 1  CH09LNL
43      446747   RTA00002oSQF.d.16.2.P.Seq       F            M00042740A:E09         CH15CON
44      28092    RTA00002"!1F.l».12.1.P.Seq      F            M00023032A:B05         CH03MAH
45      378206   RTA00002671F.J.20.5.P.Seq       F            M00033588C:G04         CH09LNL
46      37820(5  RTA0000Io"1F.a.20.2.P.Seq       F            M00033588C:G04         CH09LNL
47      1 4940   RTAOl."Oi::-09r.j.l !.l P.Se'.  F            M0000f6::-A:G02        CH02COH

50



51 



52 



56 



57 



58 



59 



61 



62 



63 



65 



66 



67 



68 



69 



70 



71 



72 



73 



74 



75 



76 



78 



81 



CLUSTER
3 7341 I 
38120 



84 



85 



86 



87 



88 



89 



90 



91 



92 



93 



94 



375730 



428959 



376851 



376168 
18653 



187632 



SEQ NAME
RTA00002672F.g.l3.2.PTslq 
RTA00002712F.i.l4.I.P.1eq 



RTA00002673F.1.13.2.P.Seq 



RTA00002667F.h. 1 5. 1 PSeq 



RTA00002677F.c.03.2.P.Seq 



RTA0000267 1 F.d. 1 4.2.P.Seq 



RTA00002675F.n. 1 7. 1 .P.Seq 
RTA000027 12F.O.08. 1 .P.Seq 



37412: 



374946 



375666 



21480 



18560 



RTA00002664F.i. 1 5. 1 . P.Seq 



ORIENTATION 
F 
F 
F 



RTA00002673F.I.22.1. P.Seq 



RTA00002673Fi.24. 1 .P.Seq 



RTA00002677F.n. 1 6.2.P.Seq 



RTA00002713F.d.24.1 .P.Seq 



RTA00002709F.c.l8.2.P.Seq 



RTA000027 1 1 F.e.20. 1 .PSeq 



96575 



446747 



3793 1 1 



37931 



124549 



449785 



375134 



186595 



44983 1 



79678 



20599 



41 1 15 



21 109 



455702 



380643 



RTA00002663Fi.08.1. P.Seq 



RTA00002682F.f. 1 8. 1 .P Seq 



F 
F 



RTA00002689F <lt6.3.P.Seq 



RTA00002682F.g.O 1 . 1 .P.Seq 



RTA00002682F.r.24.I.P.Seq 



RTA000027 13F.C.07.1 P.Seq 



RTA0000269IF.C.Q7.3.P Sec 



_F_ 
F 



CLONE ID 
M00039004B:A06 
M00026927D:F02 



M00039612B:G05 



M0003281 1B.D02 



LIBRARY 

CH09LNL~ 

CH04MAL 



CH09LNL 



CH08LNH 



M00039341C.H1I 



M00038272A:G01 
M00039253B-.E06 
M0QQ27135A:bT~ 



M00027617B:C12 



M00039104D:C09 



M00039096A:E07 



M00039422D:F04 



M00027292D:F10 
M00005531D:F06 



M00022938B:F07 



M00022641C.H05 



M00039975C:C1 1 
M00042740A:E09 
~M00039976D.A12 



M00039976D:A12 



mooo: 



7C:B0S 



RTA0Q0O2673F.k.22.2.P.Seq 



RTA000027 13F.n. 15.1. P.Seq 



RTA0000269IF. a. 17.3. P.Seq 



RTA00002676F.b.06.1. P.Seq 



RTA00002708F.h.06. 1 .P.Seq 



RTA000027 1 3F.o. 1 1 . 1 P.Seq 



RTA00002708F.h. 12.1 PSeq 



RTA00002694F.b. U. 1 P.Seq 



374413 



379374 



17253 



'1565 



373996 



380437 



430729 



376791 



373760 



373837 



376435 



373SSI 



377086 



377889 
380442 



RTA00OO2683F.P.09.2. P.Seq 



RTA00002672F.i. 15.2. P.Seq 



RTA00002672Fi. 18.2. P.Seq 



RTA00002672F.k.ll.2.P.Seq 



RTA00002709F.h.23. 1 P.Seq 



RTA0000I709F.e.l 1.1. P.Seq 



RTA00002673F.n.l l.l.P Seq 



RTA00002683F.C.09. 1 .P.Seq 



RTA00002669Fh. 18.2. PSeq 



RTA00002674F.1. 17.1. PSeq 



RTA00002672F.p.20. 1 .P.Seq 



RTA00002672F.p22. 1 .P.Seq 



RTA00002678F.h. 1 7.2. P.Seq 



RTA0000:672F.b 20. 1 .P.Seq 



_F_ 

F 



_F_ 
F 



RTA0000:676F.p.Q7. 1 P.Seq 



-RTA0000:672F.c.Q8. 1 .P.Seq 



RTA0O0O:634F.b.O5.2.P.Seq 



M00043345CA06 



M00039099A:HQ3 



M00027620D:F1 1 



M00042518D:A06 



M00039274B-.GQ/ 



M00004264B:A05 



CH09LNL 



CH09LNL 



CH09LNL 
CH.04MAL 
CH04MAL 



CH09LNL 



CH09LNL 



CH09LNL 



CH04MAL 



CH02COH 
CH03MAH 



CH03MAH 



CH09LNL 



CH15CON 



CH09LNL 



CH09LNL 



CH04MAL 



CH17COHLV 



CH09LNI 



CH04MAL 



CH17COHLV 
CH09LNL 



CHOICOH 



M00027652B:F1 i 



M00004278A.F09 



M00043433C:G07 
M00040103B.H10 



M000390I5B:G10 



M00039016A:A02 
M00039023C:B1 1_ 



M00006866A:D07 



M00005778B.F09 



CH04MAL 



CHOICOH 



CH20COHLV 



CH09LNL 



CHQQLNL 



CH09LNL 



CHO^LNL 



CH02COH 



CH02COH 



M000:-9103D:B06 



M00040039D:D06 



M00033226A:A1 I 



M00039166B:G06 



M00039049D:G07 



M00039050A:H10 



M00039476B:A02 



M000.-3638D:H03 



M0O039328D:DO7 



M0003S661A:A07 



M000401 1 1CD05 



CH09LNL 



CH09LNL 



CHOSLNH 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CHO^LN'L 



CH09LNL 



CHO^LNL 



SEQ
ID 
95 
96 



CLUSTER
374689 
375339 



97 



98 



99 



100 
101 



102 



103 
104 



105 



106 
107 



108 



1 10 



111 



u: 



113 



1 14 



115 



116 



117 



118 



119 



120 



121 



122 



12: 



124 



125 



126 



129 



130 



132 



134 



135 



136 



137 



138 



139 



140 



141 



14197 



380666 



377352 



379188 
428269 



373464 



15527 
377504 



SEQ NAME
RTA00002676F.m.l3.2.P.Seq 
RTA00002678F.m.23.2.P.Seg 



ORIENTATION
F
F 



RTA000027 I OF. f. 1 5. 1 - P.Seq 



RTA00002684F.C. 04.2. P.Seq 



RTA00002677F.i.l3.2.P.Seq 



RTA00002682F.a.03. 1 P.Seq 
RTA00002666F.C.13. 1 .P.Seq 



RTA0000267IF.I.13.3. P.Seq 



RTA000027 lQF.p.07. 1 .P.Seq 
RTA0000267 1 F.i. l7.3.plSeq" 



33508 



129179 
377086 



375872 



375652 



374266 



378983 



.377343 



378679 



374095 



375843 



RTA000027 IQF.g. 1 7. 1 .P.Seq 



RTA00002662F.d. l9.2.P.Seq 
RTA00002676F.p.07.2.P.Seq 



RTA00002675F.h. 1 5. 1 P.Seq 



CLONE ID 
M000393I3B:B09 
M000396I6A.BIO 



M0002 2084P.B0I 
n 



M000401 15B:H1 



M00039404B.A05 



_F_ 
_F_ 
F 



M00039914D:GI2 
M00032539B:C1 1 



LIBRARY 
CH09LNL 
CHO^LNL 



CH03MAH 



CH09LNL 



CH09LNL 



M00038327A.CI 1 



_F_ 
F 



RTA00002676F.i.07.3. P.Seq 



RTA0O0Q2674F.i.O8.2. P.Seq 



RTA0O0O2682F.a.07. 1 .P.Seq 



RTA00002684F.g.Q4. 1 .P.Seq 



RTA0000263 1 F.f. 16.2.P.Seq 



RTA0000267 1 F.p.08.2.P.Seq 



RTA0000267 1 F.o.06.2. P.Seq 



377788 



21403 



23184 



15671 



177367 



377788 



375058 
380412 



178447 



376647 



44679 



377659 



379703 



374673 



2051 



376124 



RTA00002684F.h.OI.2.P.Seq 



RTA00002709F. j.05. 1 .P.Seq 



RTA00002709F.b05.2P.Sc 



RTA000027 1 OF. k. 1 6. 1 P.Seq 



RTA00002663F.m.22.1 .P.Seq 



RTA00002684F.g.24.1. P.Seq 



RTA00002675F.h.02.1. P.Seq 



RTA00002630F.k. 1 5.2. P.Seq 



RTA00002663F.n.06.1. P.Seq 



RTA00002674F.h.07.1 .P.Seq 



RTA0000266 1 F.e. 19. 1 .P.Seq 



RTA00002678F. a. 04.2. P.Seq 



RTA00002682F.h.Q3. 1 P.Seq 



RTA00002673F.e.08.2.P.Seq 



RTA00002710F.i.l2.1. P.Seq 



RTA00002682F.n. 1 7. 1 P.Seq 



374679 



184 



430953 



380442 



12374 



427466 



36611 



33756 



456026 



15766 



RTA00002676F.d.07.2. P.Seq 



RTAO0002709F.b.O5. 1 .P.Seq 



RTA00002668F.i.23. 1 .P.Seq 



RTA00002684F.b.Q5. 1 .P.Seq 



RTA0000:709F.a.0 1 1 P.Seq 



RTA00002665F.b.l l.:v P Seq 



RTAOOOO:668F.f.03.I.PSeq 



RTA00002662F.a. 18.2. P.Seq 



RTA00002694F.e.03.1. P.Seq 



RTA000027 1 OF.k.02. 1 .P.Seq 



1| 



M00022747D:E03 
M0003S303CD02 



M00022133B.C02 



CH09LNL 
CH08LNH 



CH09LNL 
CH03MAH 
CH09LNL 



M00007157C:F1 1 
M00039328D:D07 



M00039233A:A03 



M00039303C:F1 1 



M00039I44C:E06 



M00039915D:C1 1 



M00040302C:A04 



M00039369B:F06 



M00038618C:C08 



M00038614CHU 



M0004O305C:H06 



M00006928D:D07 
M00005353B:B06 



M00022495D:H08 



M00022986D:H09 



M00040305C:H06 



M00039230D.GI2 



CH03MAH 



CH02COH 
CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CHOQLNL 



CH09LNL 



CH02COH 



CH02COH 



CH03MAH 



CH03MAH 



CH09LNL 



M000398I6B:D04 



M00023007A:H04 



CK09LNL 



CH09LNL 



CH03MAH 



M00039140D-D09 



M00003800A:F09 



M0003°430B.F12 



M00039982CH04 



M00039068B:B04 



M0002239ID:F10 



M0004002IA:F09 



CHQ9LNL 



CHOICOH 



CH0<5LNL 



CH09LNL 



CHO Q LNL 



CK03MAH 



CH09LNL 



M0003928ID:B04 



M00005358B:306 



M00033007C:E01 



M000401 1 1CD05 



M00004325D:D05 
M0002S184D:G10 
" M00032942D:C12 



M00005359A:D04 



M00043616C:A05 



CH09LNL



CH02COH



CH08LNH



CH09LNL



CH02COH



CH08LNH



CH08LNH



CH02COH



M00022444D:C01 



CH20COHLV 



CH03MAH 



PCT/US00/18374 

WO 01/02568 







CLUSTER 
376182 
374797 



375389 
397115 



SEQ NAME
RTA00002677F.b. 1 7.2.P.Seq 
RTA00002673F.b. 12.2-P.S^" 
RTA00002674F.a. 13.1 . PSeq 



ORIENTATION
F 
F 



CLONE ID 
M00039340B:E07 
M00039444C:H02 



M00039120C:C09 



LIBRARY 
CH09LNL 
CH09LNL 



CH09LNL 



RTA00002683F.i.22.2.P.Seq 



M00040076C:D06 



193 



194 



195 



196 



197 



198 



199 



200 



201 



202 



203 



204 



205 



206 



207 



208 



209 



210 



211 



212 



213 



214 



215 



216 



217 



218 



219 



220 



221 



224 



225 



226 



227 



228 



229 



230 



231 



234 



186655 



RTA000027 1 2F.i.21 . 1. P.Seq 



M0002694 1 D:A04 



404682 



RTA00002687F.b. 13. 1 .PSeq 



19609 



RTA00002709Fx.05.2.P.Seq 



404682 



RTA00002687F.b.l3.2.P.Seq 



380412 



RTA00002680F.k. 1 5. 1 .P.Seq 



394413 



RTA00002689F.d.l7.3.P.Seq 



380086 



RTA00002679F.m. 16.1. P.Seq 



430738 



RTA00002669F.i. 1 5.2.P.Seq 



40667 



RTA000027 1 2F.S-22. 1 .P Seq 



397421 



RTA0000268 1 F.c. 1 6.2. P.Seq 



398775 



RTA00002679F.f.l 1.1. P.Seq 



87345 



RTA000027 1 2F.f. 19. 1 .P.Seq 



379920 



RTA00002679F.b.24.2. P.Seq 



380666 



RTA00002684F.C.04. 1 .P.Seq 



404340 



RTA00002637F.b.05.2. P.Seq 



375509 



RTA0O0O2630F.e.O8.2. P.Seq 



46423 



401713 



RTA0O0027 1 2F.i.02. 1 .P.Seq 
RTA00002685F.p.lQ 



.P.Seq 



377003 



378891 



412778 



373786 



378692 



8SS83 



358187 



377131 



21488 



447487 



364 



404024 



152305 



106050 



41 126 



113496 



447487 



146335 



376647 



376746 



373523 



455466 



374031 



373997 



455717 



RTA00002633F.g.09. 1 P.Seq 



RTAQ0002672F.1. 1 8. 1 .PSeq 



RTA00002685F.i.07.2. P.Seq 



RTA00002679F.a.20.2.P.Seq 



RTA00002680F.Q.20.2. P.Seq 



RTA000027 1 3F.F.22. 1 .P.Seq 



RTA00002676F.b.04.2. P.Seq 



RTAO0002682F.e. 10. 1 -PSeq 



RTA00002703F.f. 17. 1 P.Seq 
RTA00002639F.e.04.3.P^Seq 



RTA000027 lOF.a.06. 1 P.Seq 



RTA00002687F.e. 1 3. 2. P.Seq 



RTA000027 l2F.d.02. 1 .P.Seq 



RTA000027 1 3F.Q. 1 7. 1 .P.Seq 



RTAO0OO2713F. 1. 12.1. P.Seq 



RTA000027l3F.n.20.1.P.Seq 



RTA00002689F.e.04. 1 PSeq 



RTA000027l2F.j.l7.1PSeq 



RTA00002674F.h.07.2. P.Seq 



RTA00002674F f. 1 2. 2. P.Seq 



RTA00002674F.n.21.2.P.Seq 



RTA00002694F.C. 10. 1 P.Seq 



RTA00002683F.P, 1 7.2. P.Seq 



■RTA00002673F.m.04.2.P.Seq 



RTA00002694F.a.Q6. 1 P.Seq 



_F_ 
F 



M00039766D:H01 



CH14EDT 



M00005457C:A03 



CH02COH 



M00039766D:H01 



CH14EDT 



M00039816B.D04 



CH09LNL 



M00042742D:D05 



CHI SCON 



M00039710CG03 



CH09LNL 



M00033231D:B09 
M00026882D:G09 



CH08LNH 



CH04MAL 



M00039854B:F09 



CH09LNL 



M00039675D.H05 



M00026850D:F09 



M00039660C:C10 



M000401 15B:H12 



M00039764C:D07 



M00039790B:D03 



M00026qi4A:H'lQ 



M00039647A:H1 1 



M00040062B:B05 



M000390I6A:A02 



CH09LNL 



CH04MAL 



CH09LNL 



CH09LNL 



CH14EDT 



CH09LNL 



CH04MAL 



CH12EDT 



CH09LNL 



CH09LNL 



M00039533D:F04 
M00039655CC07 



M00039835A:F07 



CH12EDT 



CH09LNL 



CH09LNL 



M00027355A:B07 



M00039273D:B02 



M00039938CEI 



M00004152A:C1 



M00042895A:D10 



M00007929CB08 



M00039958A:A08 



M00023376B:G04 



M00027668C:H12 



M00027546C:B10 



M0O0:7625A.H01 



M00042895A:D10 



M00026980A:D09 



M00039140D.D09 



M0003^I33B:F08 



M00039177B.D03 



M00043461D:E06 



M00040I05C:FH 



M00039I05C:B08 



M00042593C:G06 



CH04MAL 



CH09LNL 
CH09LNL 



CH01COH 



CH15CON 



CH03MAH 



CH14EDT 



•CH04MAL 



CH04MAL 



CH04MAL 



CH04MAL 



CH15CON 



CH04MAL 



CH09LNL 



CH09LNL 



CH09LNL 



CHZOCOHLV 



CH09LNL 



CH09LNL 
CHZOCOHLV 



13 






SEQ 
ID 


CLUSTER 


SEQ NAME


ORIENTATION


CLONE ID 
MOO039O5OA:H10 


LIBRARY 
CH09LNL 


236 
237 


373837 
374513 


RTA00002672F.p.22.2. P.Seq 
RTA00002672F.1. 16. 2. P.Seq 




M000390I5B:H09 


CH09LNL 


238 

239 

240 

241 

242 

243 

244 

245 

246 

247 

248 

249 

250 

251 

252 

253 

254 

255 

256 

257 

258 

259 

260 

261 

262 

263 

264 

265 

266 

267 

268 

269 

270 

271 

272 

273 

274 

275 

276 

277 

278 

279 

280 

281 

282 


375623 
377732 
378326 
378001 
378459 
373862 
373252 
378475 
379941 
427703 
373976 
431643 
383502 
378764 
431629 
372992 
431601 
21059 
430689 
1 3 1 764 
373300 
384601 
375389 
15248 
428134 
374134 
136225 
401713 
27104 
207466 
143045 
378830 
21731 
428552 
187632 
431053 
183972 
430673 
374042 
24332 
376764 
134338 
375541 
228909 
| 58063 


RTA00002672F.k.04.2.P.Seq 
RTA0000268 1 F.p.09. 1 .P.Seq 
RTA0000268lF.m.l 1.1. PSeq 
RTA0000268 1 F.m.22. 1 .PSeq 
RTA0000268 1 F.i.07.2.P.Seq 
RTA0000267 1 F.2.0 1 .2.P.Seq 
RTA00002670F.k. 1 6. 1 .P.Seq 
RTA00002672F.g.24. 1 .P.Seq 
RTA00002682F.j. 15.1 .PSeq 
RTA00002665F.e.l I. I.P.Seq 
RTA0000267 1 F.p. 1 3.2.P.Seq 
RTA00002669F.1.16.3,P.Seq 
RTA0OO0267OF.k.O7. I.P.Seq 
RTA0000268 1 F.j.04. 1 P.Seq 
RTA00002669F.1. 1 4. 3. P.Seq 
RTA00002671F.b. 16.2. P.Seq 
RTA00002669F.k.08.3. P.Seq 
RTA000027 1 OF.c.Od 1 P.Seq 
RTA00002669F. 1.24.3. P.Seq 
RTA00002662F.C. 1 4. 1 .P.Seq 
RTA00002674F.C.2 1 .2. P.Seq 
RTA00002670F.k.06. 1 .P.Seq 
RTA00002674F.a. 1 3.2.P.Seq 
RTA000027 1 OF. t.23. 1 .P.Seq 
RTA00002666F.C. 1 5.1 .P.Seq 
RTA00002672F.a. 1 9. 1 .P.Seq 
RTA00002676F.n.02.2.P.Seq 
RTA00002685F.p. 10. 1 .P.Seq 
RTA0000266 IF. a. 09. 1 .P.Seq 
RTA00002664F.j.08.2.P.Seq 
RTA00002663F.a.02. 1 .P.Seq[ 
RTA00002675F.e.07. 1 P.Seq 
RTA00002709F.k.07. I.P.Seq 
RTA00002666F.C. 1 6. 1 .P.Seq 
RTA00002664F.i. 1 5.2.P.Seq 
RTA00002668F.o.0?.2.P.Seq 
RTA00002664F.d.20. 1 P.beq 
RTA0000266SF.H. 12. 1 P.Seq 
RTA00002672F.a.OS. 1 .P.Seq 
RTA00002709FJ.07. 1 .P.Seq 
RTA00002674F f.20. 1 .P.Seq 
RTA0000"662F.c.l5.2.P.Seq 
RTA00002680F.d.21 .2. P.Seq 
RTA00002664F.e.08.2.P.Seq 
RTA00002661F.h. IS. I.P.Seq 


F 


M00039026D:F05 

M00039910C:GIO 

M00039896C:H01 

M00039898D:C06 

M00039879D:Bl 1 

M00038284B:H04 

M00033451A:HOI 

M00039006D:B0l 

M00039990C:D10 

M00028357A:G10 

M0003S619B:A03 

M00033276D.H09 

M00033446D:B02 

M00039884A:H1 1 

M00033276B:G08 

M00033594C:B03 

M00033263B:G04 

M0000S053A:F10 

M00033243B.A05 

M00006893C:E07 

M00039126D:A08 

M00033446CG08 

M00039120C:C09 

M00022127C:HO3 

M000:-2540A:A09 

M00O33633A:DO7 

M000393I9C:A04 

M00039647A:H 1 1 

M00001363D:D09 

M00027733A:A02 

M0000794 1 D:C09 

M00039221A:H03 

M00007013A:D09 

M00032541D:H08 

M0002 7 617B:C12 

M00033130B:F06 

M00027030CH06 

M00032994A:A08 

M00038631C:BI0 

M00006935C:F06 

M000.-9I35D:F05 

M00006897A:H02 

M0003978SA:E03 

M0002~085C:E1 1 

M00004234A:E07 


CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH09LNL 

CH08LNH 

CH09LNL 

CH08LNH

CH09LNL

CH09LNL

CH08LNH

CH09LNL

CH08LNH

CH03MAH

CH08LNH

CH02COH

CH09LNL

CH09LNL

CH09LNL

CH03MAH

CH08LNH

CH09LNL

CH09LNL

CH12EDT

CH01COH

CH04MAL

CH03MAH

CH09LNL

CH02COH

CH08LNH

CH04MAL
CH08LNH
CH04MAL
CH08LNH
CH09LNL
CH02COH
CH09LNL
CH01COH
CH09LNL
CH04MAL
CH01COH






in 


CLUSTER


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


283 


380500 


DTAnnnfPfiTOF d 19 1 P Sea 


F 


M00033533B:E06 


CH09LNL 


284 


34923 


ot Annno"*? l OF o ~* I I P Sea 


F 


M00022795B:G06 


CH03MAH 


285 


374028 


o-TAnnnn*?fi74F k 03 "> P Sea 


F 


M00039156A:81 1 


CH09LNL 


286 


374121 


DTAnnnO'>fi7* ) F h ~> P Sea 

K 1 rtUUUU-O / _r.ii.— — — i . Jvii 


F 


M00039O13A:C09 


CH09LNL 


287 


429547 


n-r AnnfifPfi68F c 07 I P Sea 


F 


M000329J7D.G09 


CH08LNH 


288 


380668 


dta nnrtfyfwOF n 11 7 P Sea 


F 


M00033581C:H10 


CH09LNL 


289 


258704 


t\ L AUUUU— OO J • .III - wvj. I . r . .j^if 


F 


M00032480B:EIO 


CH08LNH 


290 


380325 


pta nflflfp/; 7ftF n " > " > 7 P Sea 


F 


M00033583D:B05 


CH09LNL 


291 


378326 


DTAnnnn''ftR l F m 11 7 P Sea 

|\ | /\UUUU— OO 1 r .in. i i * . jtvj 


F 


M00039896C:H0I 


CH09LNL 


292 


375618 


DTAnnnm^^F H l> 1 P Sea 


F 


M00039218A:F03 


CH09LNL 


293 


20999 


K 1 AUUUU— /U"r.J. ID. I .r.jcn 


F 


M00006977C:G04 


CH02COH 


294 


29102 


r> t \ nflfin07 |HP n IS I P Sen 
R 1 AUUUUZ / lUr .p. I 0. 1 .r 


p 


M00022793D:B01 


CH03MAH 


295 


379334 


K 1 AUUUU—OoUr.D.— • 1 .r.^c^ 


F 


M00039778C:A04 


CH09LNL 


296 


23943 


K 1 AUUUU.i /Ujr.i.i — i -1 . jcij 


F 


M00006886D:H02 


CH02COH 


297 


373998 


R rAUUUU-0/_r.a. iu.-£.r .ocq 


F 


M0003863 1 D:B02 


CH09LNL 


298 


373325 


RTA000U-O / -r -C. i-k-.r.ocq 


r 
r 


M00038662B:A12 


CH09LNL 


299 


373818 


RTAOOuUJo /-r .e. 1 j.-.r.oeq 


F 


M00038995C:G08 


CH09LNL 


300 


429843 


RTAOOOO-OOiSr .c. 1 U. I .r.oeq 


p 
r 


M000329 1 SB:E06 


CH08LNH 


301 


427755 


RTAOOUU-OO^r .(J. 1 V.^.r.oeq 


F 


M0002S3 16B:H12 


CH08LNH 


302 


1 89 1 77 


RTA000O^oo4r -C— j.-i.r.ocq 


c 

r 


M00026922C:G03 


CH04MAL 


303 


13294 


r»-r* « r\nnr\T7nGE" i t t P 
RTA0UUU\i /UVr -J. 1 j. 1 .r .oeq 


F 


M00006968A:G08 


CH02COH 


304 


178801 


RTAOUUUzooj r.n.Ul.l .r.ocq 


F 


M00022997A:F06 


CH03MAH 


305 


230865 


R 1 AUUUU _00-+r .Q.U J — r.oeq 


F 


M00026923D:A03 


CH04MAL 


306 


178801 


RTAOUOU-oojr.m — 4. 1 .r.oeq 


F 

r 


M00022 C >^7A:F06 


CH03MAH 


307 


378809 


dta ftnnni/»7'> F 0 7 1 7 p Sen 
R 1 AUUUU-O /_r.g.- 1 — r.oeq 


F 


M00039005C:H01 


CH09LNL 


308 


378957 


OT \ nnnn,7 A70.F H 1 7 P Sen 
K 1 AUUUU-O / ur .0. 1 / .n.ocq 


F 


M00033362C:C05 


CH09LNL 


309 


373523 


dta nnnn7£7.d.F n ^ 1 IP Sen 
]\ \ AUvuU-D / -+ r .11.— t.i.r .jci^ 


F 


M00039177B:D0J 


CH09LNL 


310 


375458 


dt \ ni*\An,7 A7*JF 1 OA 7 P ^en 
R I AUUUU— 0 / or . I.UO.— .r .pcq 


F 


M0003961 i D:D1 1 


CH09LNL 


3 1 1 


429794 




F 


M000329|8B:D08 


CH08LNH 


312 


72797 


dt \ nf\n.n,7/£/C 1 F .» fi7 1 P Sen 
R I AUUUU— OO 1 r .CU / . I .r. ocq 


F 


M00003"61C:F02 


CH0ICOH 


313 


429992 


DTAnnnn^AARF r "'I 1 P Sea 


F 


M00032921 B:H08 


CH03LNH 


3 14 


374410 


dt \ nnnm^TJF W 1 1 7 P Sea 


F 


M00039158B:GI2 


CH09LNL 


315 


376553 


dt Annnn7fi7tiF a 10 1 P Sea 


F 


M00039I39A:C09 


CH09LNL 


316 


429096 


d t \ nn^mA^AF f ifi 1 P Sen 
K l AUUUU-Ouor . 1. 1 o. i . r . jclj 


F 


M00032573A:G06 


CH08LNH 


317 


IS 1948 


DTAfifinnT^'iF n 0"^ 1 P Sea 


F 


M00023OO3C:DO7 


CH03MAH 


318 


378475 


pTAnnon''( ! i7"'F h 01 ** P Sea 

[\ 1 AUUUU-U ' — I .11. V 1 . — .1 .J»-^ 


F 


M00039006D:B0I 


CH09LNL 


3 19 


427336 


dta nnno~ , 66^F c 1 P Sea 


F 


M00028210B:D02 


CH03LNH 


320 


374042 


ota fifinfPfi7^F a 08 P Sea 


F 


M0003S63 IC:B10 


CH09LNL 


321 


386543 


RTA00002672F.f. 1 3.2.P.Seq 


F 


M00038^ C >9B:G1 1 


CH09LNL 


322 


376659 


RTA00002678F.h.l 1.2.P.Seq 


F 


M0003 l >475C:E10 


CH09LNL 


323 


29135 


RTA00002663F.c.09.1.P.Seq


F 


MO0O21' : >23C:DI 1 


CH03MAH 


324


377967 


RTA00002681F.m.17.2.P.Seq


F 


M00039SO7D:Cl0 


CH09LNL 


325 


431330 


RTA00002668F.m.16.2.P.Seq


F 


M000330~4A:C0S 


CH08LNH 


326 


373824 


RTA00002680F.i.19.2.P.Seq


F 


M00039S0SD:H02 


CH09LNL 


327 


50094 


RTA0000266 1 Fj.0:.2.P.Seq 


F 


M000043TSA:BI.O 


CH01COH


328 


214272 


RTA00002664F.h.03.2.P.Seq


F


M00027366A:F11


CH04MAL 


329 


377293 


RTA00002680F.b.17.2.P.Seq


F


M0003^""7C:E05 


CH09LNL 



WO 01/02568 



PCT/US00/18374 



CLUSTER 
195033 
21274 



332 



376580 



374725 



334 



25238 



"» "» 5 



377337 



336 



450485 



21989 



rRTA0D00270»F.h.22.l.P.aeg 



58296 



RTA00002661F.i.20.2.P.Seq



379144 



RTA00002679F.l.l4.1.P^q 



379690 



RTA00002680F.b.21.2.P.Seq



341 



379640 



373988 
373988 



K. l .• \w^-* t/wv ' - 

~ RTA0000268TFd7l2.2.P.Seg 
RTA00002673F.h.23.1.P.Seq
RTA00002673F.h.23.2.P.Seq



348 



380673 
55243 



RTA00002673F.i.13.2.P.Seq
RTA00002661F.i.06.2.P.Seq



40557 



i P T^n nnn?71JF.h.2l.l.P.beq 



375467 



RTA00O02677F.m.O3. 1 .PSeq 



^OlTA0»679F^^ 



350 



351 



354 



430392 



•RTA00U02663F.k.l9.l.^g 



376746 



115595 



RTA00002674F.f.12.1.P.Seq
-p- T -i^n no?? ) 3F.e.07. I .P.beg_ 



377182 



p7T7w in^682F.l.ll.l.P^q 



380659 



^Tlm no^634F.e.07.2.P.Seq 



37386 



p T AA MW--6-lF.a.01.I.P^q 



355 



35/ 



358 



^-4^AQ^77F:b.l6.2.PSe 3 



Ir^gT tRTAO^l02£702^1i^ 

RTA00002672F. 2.24.2. P.Seq 



378475 



359 



361 



362 



363 



427336 



RTA00002665F.c.23.3.P.Seq



373814 i n. i - — 



RTA00002672F.b.02.2.P.Seq



-:. nrn - P -i^O:673F.c.07.2.P.i.eq 



-^y-JoTT^n02667F.c. 13.1 .Pl^ 



J ' I -nl~ Ctn ~> O 1.»r1 



375154 



RTA00002676F.c.13.2.P.Seq



431214 



376053 



-p-r Ann ncr (>69F.k.04. 1 .H.beq 
! RTA^OOOp^j^OJJJl^a 



369 
370 



372 



373 



374 



375 



376 



373282 



-RTAW02680F.i.l9.2.P.xq 



•pTTn nno^66 1 F.h.04. \.v.$eg 



- , 7 „ 192 j R 1 A 0000:63 1 F.i.09.2TP.Seq 



431612 



37847 



RTA00002669F.e.23.3.P.Seq
RTA0000-O 79Fo 17. 1 PSeq 



CLONE ID
M0002304JB' D02 
M00007l9aA:B09 
M000392i2C:Cl2 



LIBRARY

CH03MAH

CH02COH



CH09LNL 



CH09LNL 



M0000686IBFO9 



M00004354D:E05 



j / o-+ ' . i • ' • ■ . _ 

5?ii ^-jRTA00^^.^-.PS^ 



374894 



R^Xoooo;675F.r.04.I.K.seq" 



430191 



RTA00002667F.i.24.1.P.Seq



M00039705D:F02
M00039773B:G03



CH02COH 



CH01COH 



CH09LNL 



CH09LNL 



M00039859C:G10



M00039079A:A05 



M00039079A:A05 



M000:-9084C:H03 
M0000 423:D:C1 1 



M00027398C:F07_ 



M0Q0:^417A:D03 



M0003 C '639C-.E08 



CH09LNL 



CH09LNL 



CH09LNL _ 



CH09LNL 



CH01COH 



CH04MAL 
CH09LNL_ 



CH09LNL 



M0003303TD-.CU 



M00039133B:F08



M000:7297A:CQ4_ 



M00040010A:F10 
M000-:0124D.HOI 



CH08LNH 



CH09LNL 



CH04MAL 



CH09LNL 



CH09LNL 



MOOOlili^HWirCHO^ 



M00039340A:DO: 



M00033353A:H12
M00039006D:B01



M00028210B:D02



M00038635A:G09 



_M00O27433C:GO7 
MO0039O53CH0; 



M00032744B:F10_ 



M00039273B:F02



M00039465A:A08_ 
"M0003^- 79B HQ2 " 



M000:-3:fa2D.All 



M00039:49A:C12 



CH09LNL 



CH09LNL



CH09LNL



CH08LNH



CH09LNL



CH04MAL



CH09LNL



CH08LNH



CH09LNL



CH09LNL



CH09LNL



CH08LNH



CH09LNL



M000393I3B:DH.. 
M00004163A:G11 



M000392i:-B:F05 _ 



M00039330A:H1 1 



CH09LNL 



CH01COH 



M000?3:02D:G06 
M00O39T2-C.309 



M00039379C:F05 



M00039:24A:E12 



M0003:3:93:E06 



CH09LNL_ 
CH09LNL 



CH08LNH_ 



CH09LNL 



CH09LNL 



CH09LNL 



CH08LNH



1 






SEQ 
ID 



CLUSTER 



SEQ NAME 



ORIENTATION 



CLONE ID 



LIBRARY 



471 



59077 



RTA00002713F.n.01.1.P.Seq



M00027596C:E06
M0000S006B:B03~ 



472 



473 
474 



475 



1935 



RTA00002710F.b.11.1.P.Seq



CH03MAH 



379684 
451564 



RTA00002681F.c.09.1.P.Seq



RTA00002691F.f.12.2.P.Seq



M00039851B:G11
M00043411D:H06



CH09LNL 



CH17COHLV 



7571 



RTA00002710F.a.15.1.P.Seq



M00007943D:C09 



CH03MAH 



RTA00002713F.k.21.1.P.Seq



M00027525B:D06 



CH04MAL 



477 



478 
479 



12960 



RTA00002710F.a.23.1.P.Seq



M00007976A:C10 



RTA00002713F.o.05.1.P.Seq



M00027641C:A03 



CH04MAL 



480 



481 



482 
483 
484 



59077 



RTA00002713F.m.24.1.P.Seq



185884 



RTA00002712F.b.06.1.P.Seq 



19471 



RTA00002708F.g.08.1.P.Seq



45206
404257



RTA00002710F.c.06.1.P.Seq



RTA00002687F.g.06.2.P.Seq



M00027596C:E06 



M00023316C:G08



CH04MAL 



M00004197B:H10



CH01COH 



M00008063B:A06 



CH03MAH 



M00040208A:C03 



CH14EDT 



485 



486 



487 



488



489 



490 



491 



492 



493 
494 



495 



496 



497 



498 



499 
500 
501 



502 



503 



504 
505 
506 



507 



508 



509 



510 



511 



512



513



514 
515 



516 



517 



372997 



RTA00002679F.p.04.1.P.Seq



M00039729A:A10 



43792 



RTA00002713F.k.16.1.P.Seq



M00027520A:C05 



CH04MAL 



400052 



RTA00002687F.h.13.2.P.Seq



M00040291D:C05



CH14EDT 



452194 



RTA00002692F.c.14.2.P.Seq



M00042988A:F06 



CHI SCON 



24034 



RTA00002710F.b.06.1.P.Seq



M00007992C:F06 



CH03MAH 



447544 



RTA00002689F.e.18.1.P.Seq



401872 



RTA00002686F.c.23.1.P.Seq



376553 



455051 



16760 



374174 



374283 



375772 



376417 



423971 



394098 
379761 
374266 



372946 



228909 



427524 
380413 
373366 



427202 



373000 



378838 



24945 



20277 



20820 



376791 



9309 
429562 



12920 



377565 



RTA00002674F.i.19.2.P.Seq



RTA00002694F.a.07.1.P.Seq



RTA00002708F.i.03.1.P.Seq



RTA00002672F.i.12.2.P.Seq



RTA00002672F.k.21.2.P.Seq



RTA00002681F.o.24.1.P.Seq



RTA00002673F.i.03.2.P.Seq



RTA00002666F.Q.02.1.P.Seq



RTA00002681F.j.15.1.P.Seq



RTA00002670F.n.03.1.P.Seq
RTA00002674F.i.03.1.P.Seq



RTA00002670F.I.07.1.P.Seq



RTA00002664F.e.08.1.P.Seq



RTA00002665F.e.05.1.P.Seq



RTA00002680F.k.19.2.P.Seq



RTA00002671F.c.24.2.P.Seq



RTA00002665F.a.15.1.P.Seq



RTA00002670F.j.13.1.P.Seq



RTA0000267SF.p.11.2.P.Seq



RTA00002710F.p.05.1.P.Seq



RTA00002710F.e.17.1.P.Seq



RTA00002710F.e.02.1.P.Seq



RTA00002674F. 1. 17.2. P.Seq 



RTA00002710F a. 



P.Seq 



RTA00002667F.m.03.1.P.Seq



RTA00002710F.e.15.1.P.Seq



RTA00002684F.h.19.1.P.Seq



M00042905D:D02 



CH15CON 



M00040141D:F05



CH13EDT 



M00039139A:C09



CH09LNL 



M00042595A:A11



CH20COHLV 



M00004393B:E07 



CH01COH 



M00039015A:D07 



M00039030B:E02 



M00039909C:G05
M00039477D:A10 



M000326"SC:D06 



M00039887C:E07 



M00033561C:A02 



M00039144C:E06 



M00033457D:A05 



M00027085C:E11



M00028354D:A03 



M00039816C:D05



M00038259C:H09 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH08LNH 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH04MAL 



CH08LNH



CH09LNL 



CH09LNL 



M0002S6PC:A1 



M00033437C:C03 



M00039637C:A10 



M00022739A:B03



M00021972D:C11



M00021919C:A10



M00039166B:G06 



M000:2178B:D06 



M00032853D:G12 



M00021964C:E10



CH08LNH



CH09LNL 



CH09LNL 



CH03MAH 



CH03MAH 



CH03MAH 



CH09LNL 
CH03MAH 



CH08LNH



CH03MAH 



M000-i0309A:El 1 



CH09LNL 






CLUSTER
429356 
427634 



520 



531 



532 



533 



534 



535 



536 



537 



538 



539 



540 



541 



542 
543 



544 



545 



546 



547 



548 



549 



550 



551 



55: 



553 
554 



SEQ NAME

RtA0Q00-668F.d.23. 1 p ~ 
RTAnOQ0266iF.f.09.1 .P.Seq 



ORIENTATION 



555 



427713 



556 



557 



373607 
378781 



429361 
126754 



558 



559 



560 



561 



RTA00OO-66^ Fe - 23lP - Sec '- 
'RTA00002674F.d.l5.2.P.Seq 

"RTA00002674F.O- 14. 1 .P.Seq 



RTA0Q002666F.d.l l.l.P.Seq 
gTAnnOQ2663F.a.l6.I.PSeq 



562 



563 



564 



400265 



380056 



375324 



25165 



401296 



394098 



17430 



373820 
378548 



222679 



376874 



21329 



119905 



377028 



373351 



376082 



376987 



61921 




PTAf)O00268SF.c.03.2.P.Seq 



RTA00002680F.a.16.2.P.Seq



RTA00002678F.l.12.2.P.Seq



RTA00002710F.k.17.1.P.Seq



RTA00002685F.h.23.2.P.Seq



RTA00002681F.i.15.2.P.Seq



RTA00002710F.i.11.1.P.Seq



RTA00002674F.d.06.1.P.Seq



RTA00002672F.g.14.2.P.Seq



RTA00002664F.f.18.2.P.Seq



RTA00002670F.e.23.2.P.Seq



RTA00002709F.b.08.1.P.Seq



RTA00002710F.p.13.1.P.Seq



RTA00002678F.n.21.2.P.Seq



RTAOO0O-671F.1. 18.3. P.Seq 



RTA00002674F.m.17.1.P.Seq



RTA00002678F.g.21.2.P.Seq



RTA00002661F.g.08.1.P.Seq



373486 



380355 



430295 



379221 



373532 



375633 



378356 



376196 



375115 



375115 



378600 



375351 



25237 



193503 



428268 
379440 



RTA00002672F.b.03.2.P.Seq



RTA00002670F.o.06.2.P.Seq



RTA00002667F.h.14.1.P.Seq



RTA00002682F.n.01.1.P.Seq



RTA00002672F.d.10.2.P.Seq



RTA00002677F.m.05.2.P.Seq



RTA00002681F.f.07.1.P.Seq



RTA00002674F.m.12.1.P.Seq



RTA00002673F.d.24.2.P.Seq



RTA00002673F.e.01.2.P.Seq



RTA00002679F.i.03.1.P.Seq



RTA00002680F.e.15.1.P.Seq



RTA00002710F.n.23.1.P.Seq
RTA00002663F.n.15.1.P.Seq



RTA00002667F.b.01.1.P.Seq



RTA0OOO:68;F.i.21.2.P.Seq 



CLONE ID 
'm0O032933A:C1O_ 
M00023369D.E08 



M00028364C:G08 



M00039127D:E10
M000391963:H06 



M00032550D:C02 
M00008045A:H02 



LIBRARY 
CH08LNH 
CH08LNH 



CH08LNH 



CH09LNL 
CH09LNL 



CH08LNH 
CH03MAH 



M00039773D:F1 1_ 



M000396123:BI0_ 



M000224963:E12 



M00039529C:D07 



CH09LNL 
CH09LNL 



CH03MAH 



CH12EDT 



M000:-938"C:E07 



M00022365D:A03 



M00039127A:G11



MQ00;900-iB:Cl 1_ 



M000:7228D.A01_ 



M000;-3375.A:G04 



M00005379A:E04 



M00022785C:G06_ 



M00039631 A:C10 



M00033327D:.A05_ 



M000391713:D11 



M00039472C:B08 



M0000399;3:E03 



M000386353:C08 



M00033570C:C10 



M00032803B:G10 



M00040017D.G03 



M0003399IA:D01 



M000394 1~B:F01 



CH09LNL 



CH03MAH 



CH09LNL 



CH09LNL 



CH04MAL 



CH09LNL 



CH02COH 
CH03MAH 



CH09LNL 



CH09LNL 



CH09LNL 



CH01COH 



CH09LNL 
CH09LNL 



CH08LNH 
CH09LNL 



CH09LNL 



M00039866BA08 



M00039I70C:F05 
M000;9066DG08 



M00039066D:GOS 



M00039636C:E06 



M0003979:a.304 



M0OO2:671B:AQ8 



M00023039D:305 
M0003:724A.CQj 



M00040080C:C06 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL_ 
CH03MAH 



CH03MAH 



CH08LNH



CH09LNL







599 



600 



601 



602 



603



604 



605
606 



607 



608 



611 



161116
379211



RTA00002714F.c.11.1.P.Seq
RTA00002682F.p.20.1.P.Seq




M0002783"C:D09 
M00040029A:G04 



CH09LNL 
CH08LNH 



RTA00002712F.m.21.1.P.Seq



23030 



372946 



375351 



RTA00002709F.b.10.2.P.Seq



RTA00002670F.I-07.2.P.Seq 



374502 



376911



376024 



377194 



379643 



79610 



25613 



207466 



400052 



21290 



RTA00002680F.e.15.2.P.Seq



RTA00002673F.i.08.1.P.Seq



RTA00002682F.e.09.1.P.Seq



RTA00002675F.n.15.1.P.Seq



RTA00002679F.h.20.1.P.Seq



RTA000026S2F.g.0S. 1 .P.Seq 
RTA00002680F.k.11.1.P.Seq



RTA00002711F.g.06.1.P.Seq



RTA00002664F.i.03.1.P.Seq



M00003384A:C1 1 



CH02COH 



M00033457D:A03 



CH09LNL 



M00039792A:B04



M00039080C:H06



M00039933C:A08



M00039257D:C03



CH09LNL 
CH09LNL 
CH09LNL 
CH09LNL 



M00039635A:A08 



CH09LNL _ 



M00039973A:C03 



CH09LNL 



RTA00002687F.h.13.1.P.Seq



RTA00002712F.g.01.1.P.Seq



M00039815C:F09



M00023024D:F12 



M000:7733A:A02 



M00040291D:C05



CH09LNL 
CH03MAH
CH04MAL
CH14EDT 



M00026859D:D01



CH04MAL 







631 



632 



633 



634 



635 



636 



637 



638 



639 



640 



641 



642 



643 



644 



645 



646 



647 



648



649 
650 
651 
652 
653
654 



655 
656 



657 
658 



186387 



21093 



20827 



21290 



17646 



402817



42854 



430876 



RTA00002713F.k.24.1.P.Seq



RTA00002708F.h.20.1.P.Seq



RTA00002710F.c.23.1.P.Seq



RTA000027 12F.1.24. 1 .P.Seq 



RTA00002710F.d.22.1.P.Seq



RTA00002686F.3. 1 0. 1 .P.Seq 



RTA00002713F.n.09.1.P.Seq



RTA00002669F.c.02.3.P.Seq



575843 



36165 



456506 



374450 



.78949 



373313 



377861 



431196



72795 



42340 



374410 



374623 



431612 



240615 



428508 



235780 



17890 
20100 



4458 
378347 



RTA00002679Fa.2 1 ,2.P.Seq 



RTA00Q02674F.m.0:-.2.P.5eq 



RTA00002703F.i.06. 1 .P.Seq 



RTA00002694F.d.Qj.l. P.Seq 
RTA00002672F.i.05.2.P.S~eq 



RTA00002683F.Q.2 1.2. P.Seq 



RTA0000267 1 F.m.02.2.P.Seq 



M0000430SC:C06 



M00021671D.F12 



M00026859D:D01 



M0002 190SD:G12 



M00039736D:G08 



M00027615A:F10 



M000331S6C:D1I 



M000396?:C:EOS 



RTA0O00268IF.m.20.[.P.Seq 



RTA00002669F.f.07.2.P.s"eq" 
RTA00002683F.a.06. l.P.Seq 



RTA00002661F.b.03.1.P.Seq
RTA00002674F.k.11.1.P.Seq



RTA00002674F.a.01.:.PSeq 



RTA00002669F.e.23.2.P.Seq 



RTA00002672F.e.19.1.P.Seq



RTA00002666F.d.01.1.P.Seq



RTA00002666F.d.03.1.P.Seq



RTA00002710F.e.11.1.P.Seq



RTA00002710F.g.11.1.P.Seq



RTA00002710F.g.13.1.P.Seq



RTA00002681F.h.07.2.P.Seq



M00039163C:A04



M00004340C:C07



M00043402A:E01



M00039014A:H10



CH01COH



CH03MAH 



CH04MAL 



CH03MAH 



CH13EDT 



CH04MAL 



CH08LNH 



CH09LNL 



CH09LNL 



CH01COH 



CH20COHLV 



M00040IOOD:B06 



M000383:SD:A03 



M00039893A:A08 



M0003320^B:A07 



M000400:-:A:B03 
M00Q0I4;^C:H06 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH08LNH 



M0003915SB:Gi: 



M00Q391 ISD:A06 



M0003320:D:G06 



M0003399.<D:E05 



M00032545B:H09 



M0003254fD:G05 



M0002 1955AJ-I02_ 



M00022 I'f D:D12 



M000221S^C:C1 ! 



M000393"5D:A10 



CH09LNL 



CH01COH



CH09LNL 



CH09LNL 



CH08LNH 



CH09LNL 



CH08LNH



CH08LNH 



CH03MAH 



CH03MAH 



CH03MAH 



CH09LNL 







679 



3^2795 RTA00002683F.a.06.2.P.Seq 



680 



429340 



681 



42982 



682 



375224 



683 



378347 



RTA00002666F.f.12.1.P.Seq



RTA00002668F.e.17.1.P.Seq



RTA00002680F.d.22.2.P.Seq



nTAnnOO2631Fh.07.1.PSeq 



684 
685 



686 



380109 



379001



375348 



RTA00002682F.i.17.1.P.Seq



RTA00002683F.o.02.1.P.Seq



RTA00002676F.i.12.3.P.Seq



RTA00002672F.c.08.2.P.Seq




M00032577A:C04 



M00032939B:E07_ 



M000397S3B:A06 



M00039875D:A10



M0003998"C:G08_ 



M00040097A:C12



M00039304D:B09 



M00033661A:A07
M00032793A:F06 



CH08LNH



CH08LNH



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL
CH08LNH






CLUSTER 
374311 



707 



708 



709 



710 
711 



712 



713 



716 



717 



718 



721 



722 



723 



724 



725 



726 



727 



728 



729 



730 



731 



732 



733 



734 



735 



736 



737 



738 



739 



740 



741 



278923 



742 



378667 



743 



380454 



744 



381576 
375067 



745 



746 



747 



89706 



10583 



379982 
378532 



SEQ NAME 
RTA00002676F.e.18.2.P.Seq



ORIENTATION 
F 



RTA00002667F.b.10.1.P.Seq



RTA00002681F.b.11.2.P.Seq



RTA00002673F.i.16.1.P.Seq
RTA00002670F.i.04.2.P.Seq
OT AO0002675F.Q.03. 1 .P. Seq 



RTA00002714F.a.11.1.P.Seq



RTA00002711F.h.11.1.P.Seq



379776 



748



749 



750 



751 



374136 



752 



98471 



125365
375431 



62826 



379972 



377554 



230479 



98872 



42635 



379044 



96093 



403642 



400921 



93587 



7995' 



176509 



RTA00002682F.i.16.1.P.Seq
RTA00002680F.n.04.3.P.Seq 



F
F
F



RTA00002680F.a.22.2.P.Seq 



RTA00002673F.f.16.1.P.Seq



RTA00002663F.i.21.1.P.Seq



RTA00002668F.i.07.1.P.Seq
RTA00002680F.f.03.2.P.Seq 



RTA0000266 1 F.2.20. 1 PSeq 



RTA00002679F.e.10.1.P.Seq



CLONE ID 
M000392S7C:AUO 



M00032726C:C01



M0003984~A:F06 



M0003908-tD:D0 7 . 
M00033425A:C10 



M00039260C:G03 



M000277413:F09 



M00023100A:E12



M00039987C:E12 
M00039823B:C05



M00039774C:A03 



M00039072C:C03



RTA00002679F.f.10.1.P.Seq



RTA00002664F.c.16.2.P.Seq
RTA00002663F.j.19.1.P.Seq



RTAOOO02679F.h.1S.l. P.Seq 



RTA00002679F.a.10.2.P.Seq



VtO0Q2267OD:Hl 1 



M00033019B:E10 
M00039793D:C05



LIBRARY 
CH09LNL 



CH08LNH 



CH09LNL 



CH09LNL 
CH09LNL 



CH09LNL 



CH04MAL



CH03MAH 



CH09LNL 
CH09LNL 



CH09LNL 



CH09LNL 



CH03MAH 



M00004105D:D05 



M00039672D:D10



M00039675D:B03 



M00026915B:C06 



M000226688:B12 



CH08LNH 
CH09LNL 



CH01COH 



CH09LNL 



CH09LNL 
CH04MAL 



M00039634D:B08 



RTA00002663F.j.07.1.P.Seq



RTA T)0O02637F.d.01.2.P.Seq 



RTAf)0 002685F.b. 18.2. P.Seq 



RTAn 0OO2663F.k.l0.1. PSeq 



RTAn0002713F.cl3.1. P.Seq 



451753 



RTA00002686F.b.09.1.P.Seq
RTA00002694F.e.06.1.P.Seq



186266 

235052 



377233 



3785 



177932 



9332 



240318 



404260 



93767 



185642 



447544 



403274 



404257 



403363 
450074 



404520 



451789 



455173 



RTA00002713F.c.16.1.P.Seq



M00039652B:D05 



M00022640C.C1: 



M00039945C:F09 



M00039371B:H06 



M00022731A:D02



M000:7253A:A07 



M00039756B:H06_ 



RTAOO0O2692F.a. 15.2. P.Seq 
RTA00002632F.e.23 . 1 . P.Seq 



RTA00002680F.n.04.2.P.Seq 



RTA00002713F.b.22.1.P.Seq



PTAOO002712F.p.l3.1.PSeq 



RTA00002637F.d.Q4.2.P.Seq 



RTAOO002637F.C.1 1.2. P.Seq 



RTA00002" 12F.2.09. 1 .P.Seq j 
RTA00002712F.r.20.1. P.Seq 



RTA000026S9F.e.l3.:-. P.Seq 



RTA000026S7F.b.lO.:.PSeq 



RT A0000263~F.g.06. 1 .P.Seq 



RTA00002637F.k.Q5.: P.Seq 



RTA00002691F.e. 12.:. P.Seq 



M00043634A:C10 



M000:72563:HQ9 
M00042626B:D08 _ 



CH03MAH 



CH09LNL 
CH09LNL 



CH03MAH 
CH14EDT



CH12EDT 



CH03MAH 



CH04MAL 



CH13EDT



CH20COHLV



M00039940P:GOS 



M0Q039323B:C05 



M00027233B.C01 



MOOO:7I79P.E06 
"M00039947A:UOtj 



CHI SCON 



CH09LNL 



CH09LNL 



CH04MAL 



M00039942D:C01 



MQQ0:6868C:E1 1 



M000-6856P:F02 



M00Q4:905D:D02 



RTA00002687F.r.Q5.:. PSeq 



RTA00002692F.b.04.:. P.Seq 



RTAQ000:694F.b. 1 9- 1 PSeq 



MQ003^766A:GQ7 



M00040:03A:C03 



M0QQ403I3C:K1 1 



M00Q433^:D:C1 1 



M00040202A:F05 



CH04MAL 
CH14EDT
CH14EDT



CH04MAL 



CH04MAL 
CH15CON 



CH14EDT



CHUEDT 



CH14EDT 



M0004:956C:B06 



M000-':-447A. CQ7 



CH17COHLV
CH14EDT



CHI SCON 



CH20COHLV 






CLUSTER 
455136 
379001 
374763 



SEQ NAME 
RTA00002694F.a.08.1.P.Seq
RTA00002683F.o.02.2.P.Seq
RTA00002673F.p.21.1.P.Seq



756 



402508 



757 



431370 



758 



380500 



759 



376743 



760 



191690 



761 



374264 



RTA0Q002686F.Q. 13. 1 -P-Seq 



PTAnnnO1669F.m.04.3.P.Seq 



RTA00002670F.pl 9. l.P.Seq 
RTA00002678F.e.22.2.P.Seq 



RTA00OO2673F.m. 19. 1 .P.Seq 



ORIENTATION 
F
F 
F 



RTA0000267 1 F.P-2 1 .2-P.Seq 



373020 
375231 



~DTln nnO2671F.b.20.2.P.Seq 
"RTA0000267 1 F.m.20.2.P.Seq 



CLONE ID 
M00042595A:B01 
M00040097A:C12



M00039113B:C05



M00040281D:B01 



M00033233B:D12



M00033583B:E06



M00039461A:F04 



M00039107C:E04 



M00038620B:E09_ 
M00033595A:C1 
M00038387B:A07 



LIBRARY 
CH20COHLV
CH09LNL 



CH09LNL 



CH13EDT 



CH08LNH 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 



CH09LNL 
CH09LNL 
CH02COH 








817 



818 



819 



820 



821 
822 



823
824 



825 
826 



827 



828 
829 



830 



831 



833 



834 



835 



836 



837 



838 



839 



840 



841 



842 



843 



844 



845 



846 



186359 



404290 



375443 



380279 



386110



380279 



RTA00002713F.g.24.1.P.Seq 



RTA00002688F.e.04.2.P.Seq 



RTA00002676F.g.19.2.P.Seq



386986 



186359 



375611 



378285 



44025 



25240 



403700 
404679 



454806 



376829 



456309 



374510 



377232 



375779 



RTA00002673F.i.24.1.P.Seq
RTA00002687F.e.06.1.P.Seq
RTA00002673F.j.01.1.P.Seq



RTA00002675F.p.06.1.P.Seq
RTA00002713F.h.01.1.P.Seq



RTA00002677F.o.20.2.P.Seq 



F 
F 



M00039298B:D03



M00039082B:A05 



M00039955C:C04



M00039082B:A0: 



M00039266A:302 



RTA00002679F.h.01.1.P.Seq



RTA00002684F.b.24.1.P.Seq



RTA00002711F.c.12.1.P.Seq
RTA00002687F.g.03.2.P.Seq



RTA00002687F.f.07.1.P.Seq



RTA00002693F.b.12.2.P.Seq



RTA00002674F.f.21.2.P.Seq



RTA00002694F.j.16.1.P.Seq



RTA00002672F.U 7.2^.5^ 



RTA00002683F.m.08.2.P.Seq 



90746 



453002 



402863 



402526 



412778 



402273 



RTA00002672F.i.20.2. P.Seq 



RTA00002671F.a.07.2.P.Seq 



M00027379C:B0~ 



CH09LNL 



CH09LNL 



CH14EDT 



CH09LNL 



M00039425D:E1?_ 



M00039681B:H09 



M000401 15B:A04 



M000228?4A:303 



M00040207B:D08_ 



CH09LNL
CH04MAL 



M00040203A:H06 



M00043093C.GH 



M00039l35D:O0-_ 



CH09LNL 
CH09LNL
CH09LNL
CH03MAH
CHUEDT~ 
CHUEDT__ 
CH19COP~ 
CH09LNL 



M00043SI8B:D06 



RTA00002692F.b.2 1.2. P.Seq 
RTA00002636F.n. 1 2. 1 .P.Seq 



RTA00002636F.P.07. 1 .P.Seq 



RTA00002685F.i.Q7. 1 .P.Seq 



RTA00002686Fi.l3.I.P.Seq 



374744 



375764 



428218 



374S09 



20162 



RTA000026~0F.i. 1 6. 1 .P.Seq 



RTA00002677F.Q. I 3. 2. P.Seq 



RTA0OUO:667F.c.0 1 I P.Seq 



RTA00002675F.h.0 1 I .P.Seq 



RTA00002710F.n.20.I.PSeq 



M00039015D:H04 



CH20COHLV 
CH09LNL^ 



M00040090B:G09 



M00039025/VH09 



CH09LNL 
CH09LNL 



M000:-3585D:AO2 



CH09LNL 



M00042970C:H10 



M00Q40273B.H1: 



cm scon 

CH13EDT 



M00040236C.CO: 



CH13EDT 



M00Q39533D:FO4 



M00040233C:G05 



CH12EDT 
CH13EDT 



M00033427D:F01 



M00039425CG01 



M00032731CC07 



M00039230D:DQ9 



M0002266IDG1 1 



CH0°LNL 

CHQ9LNL 

CHOSLNH 

CHQ9LNL~ 

CH03MAH 







869 



870 



871 
872 



873 



874 
875 
876 
877 
878 
879 



880 
881 



882 



883 
884 



885
886



887 



888 
889 
890 
891 
892 
893 



263630 



404277 



403537 
375161 



376829 



372958 



21578 



402506 
141731 
37411 
372537 



RTA00002694F.e.lO-l.pTSeq 



RTAn0002637F.d. I 3. 1 .P.Seq 



RTA00002687F.d. 10. lPSeg 



RTA000026"6F.m.24.2. P.Seq 



RTA00002674F.f.:i.l-PSeq 



RTA00002672F.c.O:.2.P.Seq 



RTA00002709F.a.24. 1 .P.Seq 



RTA0OOO2686F.b. 17.1 .P.Seq 



RTA0000271 3F.b.Q4. 1 .P~Seg 



380834 
401492 



99998 



404311



231084 



447679 



377012 



226207 



446183 



428508 



157643 
404609 
400464 
379108 



RTA00002661F.e.l 1.1. PSeg 



RTA00002670F.c.03.2.P.Seq 



RTA00002670F.C.03.2. P.Seq" 



RTA00002685F.n. 1 7.2. P.Seq 



RTAO0002662Fb.23.2.P.Seq 



RTA00002683Fd.2 1.2. P.Seq 



RTAOO002664F.c.l3.2.P.Seq 



RTA00002689F.b.l 1.3. P.Seq 



RT AO0O026S2F.d. 1 7. 1 .P.Seq 



RTA00002664F.d.: I l.P.S^g 



RTA000026S9F.3. 121 PSeg 



RTA00002666F.c.:4.1 P.Seq 



RTA000027 1 4F.b.:0. 1 -PSeg 



RTA0000268SF.b. 1 3. 2. P.Seq 



RTA0O00?685F.I.I0.1-P Seq 



RTA00002635F.1. 12.1 -PSeg 



%1 



M0003995 i B:C03 



M00039943A:E03 



M000393 I9B:H12 



M00039135D.G02 



M00033639D:F07 



iV10000533lCG05_ 



CH14EDT 



CH09LNL 



M00039760B:B03 



M00027212D:E03 



CH09LNL 

CH09LNL 

CH02COH_ 

CH13EDT 

CH04MAL" 



MO0OO377OA:EQ5 



CH01COH 



M00033345D.A09 



CH09LNL 



M00033346C:A05 



M00039609D:Fq7 



CH09LNL. 
CH12EDT 



M00006"I2C:H09_ 



CH02COH 



M00040394A:D04 



CH14EDT 



M00026913B:D01 
M00042560A:FI2 
M00039936C.CQ: 



CH04MAL 



CH15C0N 



M00027035D:C06 



M0004253-tA.AQj 



M0003254:B:H09 



CH09LNL 
CH04MAL 
CH15C0N_ 
CH08LNH 



M00027818C:CQ7 



M00040377C.GOJ 



CH04MAL, 

"chuedt" 



M0003959QD:D02 



CH12EDT 



M0003959iC:D06 



CH12EDT 






CLUSTER 
374639 



925 



935 



938 



939 



940 



SEQ NAME 
RTA00003676F.d.2 1 .2.plSeq 



ORIENTATION



CLONE ID 
M000392S^|5 : BT2 



400517 



403578 



403578 
403371 



LIBRARY 
CH09LNL 




RTA00002687F.k. I 5.2.P.Seg 



RTA00002687F.i.0 1 .2.P.Seg 



RTA00002687F.h.24.2.P.Seq 
RTA00002687F.h.l9.2.P.Seq 




287963 RTA00002693F.c.20.:.,P.Seq 



20847 



45653 1 



450463 



456' 



455508 



RTA000027 lOF.d.09. 1.P.Seq 



RTA00002694F.b. I S. I .P.Seq 



RTA00002694F.3. 1 2. 1 .P.Seq 
RTA00002694F.d. 1 3. 1 -P.Seq 



RTA000026 c >4F.a. 1 5. 1 .P.Seq 



M00040:96D:E09 



M00040296D:E09 
MQ0Q40294D.DP 



CHUEDT_ 
CH14EDT 
CHI4EDT 



MOQ0:i852D:A05 
M000-T-446C:E1 
M0004:5^6C:DO' 



M0004:-513D:G08 



j MOO0-t:5^~B:El'- 



CH03MAH 



CH20COHLV 



CH20COHLV 



CH20COHLV_ 
CH20COHLV 






SEQ
ID




SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


941


J /Ol JO 




F ' 




CH09LNL 


942


4l)ZOJ 1 


dt a nnfifi" 1 £Rn.F m fl"? 1 P Sea 


F 


M00040"'64DG05 


CH13EDT 


943


j7j8iU 


ota nnflfi7A7J.F rl flfi "* P Sen 


p 


V100039I"'7A'G1 1 


CH09LNL 


944


85 j8S 


n t \ nnnn™>A7JP r OA "J P Sen 
K 1 AU»JUU_0 /■♦P.t.w— .->cq 


r 


I\/10nn"i9 1 "»4C~ ■ H08 


CH09LNL 


945 


400732 


D T \nnnPj1A9^F U- 7tl 7 P Sen 


p 


Wnflf)'i958"C Fl" 7 


CH12EDT 


946


4 j 1629 


or * nnnm^AOF I 1*1 1 P Sen 


p 


M00033''76aG08 


CH08LNH 


947 


449349 


K 1 AUUUU— ovur .a. t .i.j-r.ocq 


p 


M00042802C:C04


CH16COP 


948 


401124


DT \ rtAnfl">AB^r n 1 M p Sen 


P 
r 


■ M000196"'9D-B04 


CH12EDT


949 


453233 


o T \ f\r\f\r\1 F a (IP P Sen 
K. 1 AUUUU-07jr.a.u i .— -r . .acq 


p 


M00042611A:A06


CH19COP 


950 


124813 


DT \ nrtnnlAfl <c ; i (\ 7 p S»*n 
R I AUiJUU^OoJ r.j. I u~i.r-oeq 


p 


VI00039564BCO 1 


CH12EDT 


951


454627 


dt* * nr\r\A"iAQ» F f 00 P Sen 
R I AUUUU-loVj r.i.uy. — r.jeq 


c 
r 


M0004 1~> IOC - E05 


CH19COP 


952 


169464 


RTAOOOUiOOjr.i. 1 v. 1 .r.^eq 


r 


.V1000''''60' ? A ■ F09 

1VIL/VJU — — UVJ — .LU7 


CH03MAH 


953 


451654


RTAOuuuZoVZr .r.U-.-.r.oeq 


p 
r 


VI00043044D' \09 


CH18CON


954 


406092 


RTAuUUU-ioojr.k. 1 1 ..i.r.ocq 


p 

r 


N/10nO'i9584CCI 1 


CH12EDT 


955 


453501 


RTA00002oyjr.a. I4..:.r.oeq 


p 

r 




CH19COP 


956 


450845 


RTA00002691 r.r. II). 1 .r.beq 


c 

r 




CH17COHLV 


957 


448177


RTA0000Jo9Ur.e. I-. 1 r.seq 


p 
r 


iV|UUv*+-OJ~D. D 1 i 


CH16COP 


958 


402617


RTA00002686F.D.2 1 . 1 .P.beq 


c 
r 


Vfinnfun I * i Ft* 1 1 


CH13EDT


959 


378014 


RTA000026SOF .g. 1 /. 1 .r.ieq 


p 
r 


vinnn^07Qo a. - n 1 fl 


CH09LNL 


960 


124813 


RTA00002683F.J. 10. 1 .r.beq 


c 
r 




CH12EDT


961 


29450 


RTA0000266jr.a.O/. 1 .r.beq 


c 

r 




CH03MAH 


962 


400486 


RTA0O0O_68^r.e.OJ. 1 .F.seq 


r 




CH12EDT


963 


44753 


RTA00002 / 1 jF.r.0?. 1 .r.beq 


p 
r 


v 1 n n *? 7 "i 0 1 r\ - p n < 


CH04MAL 


964 


448177


RTA00002690r.e. !_._.r.beq 


p 

r 


iV| \J\J\J~*— 0 J • D - O 1 1 


CH16COP 


965 


447697 


RTA00002689F.e. o.j.r.5eq 


p 
r 


Y.1fll"W\Ll - >Qi'W A F 1 1 
iviv/vy*4— yyj- .~\ . r 1 1 


CH15CON


966 


240318


r»T* * Q*7C i-l fVI 1 D Qjn 

RTAOOOUioo / r .a.U'+. i.r.jeq 


p 

r 


MfinO i9947 A D06 


CH14EDT 


967 


451620


RTA000U269 1 r.d.JU.j.r.oeq 


F 
r 




CH17COHLV 


968 


400157 


RTAUUUUiOo )r .1 — u.-.r.jcq 


F 


V1000 5 o; >6 I 9- A09 


CH12EDT 


969 


400276 


RTA0UUU_Oojr .n. iO.-.r.ocq 


p 
r 




CH12EDT 


970 


449779 


R 1 AUUUU-O" i r .a.w.j.r.octi 


p 


fvlOOlUi 36 T B- AOS 


CH17COHLV 


971 


400157 


DT \nnAr\"^AQsP i 7Pl 1 P Sen 
R 1 AUUUU-Oo j r .1.— \J. i .r. jcq 


F 


VI00039>6 1 B- A09 


CH12EDT 


972 


238133 


R I AUUUU-OOjr.tf.U^— .r .oeq 


F 
r 


VlOnfl i9496B H09 


CH12EDT 


973 


452015 


dt \ nnArt^AQ*?P r ft7 "* P Sen 


p 


M00042981B:D11


CHI SCON 


974 


400732 


dta nnnn">AS^ P l 0 1 7 p Sen 




M00039587C:F12


CH12EDT 


975 


24984 


DT \ Artf1rt77 1 1 P H 7 1 IP Sen 


p 


M000" l ' 1 9 1 0 A • A06 


CH03MAH 


976 


449040 


dta nnnf»7AQnF e 14 7 p Sen 


p 


M00042841D:H07


CH16COP


977 


377481


d t \ fH^nil"^? 1 P i 1 S ^ P Sen 


p 


M00038303A:C03


CH09LNL 


978


400910 


RTA00002685F.b.07.1.P.Seq


p 


M00039367B:H02


CH12EDT 


979 


376945 


RTA000O2682F.k.23.1.P.Seq 


F 


M00040007D:A06


CH09LNL


980 


15906


RTA00002709F.e.14.1.P.Seq


F 


M00005805D:D12 


CH02COH 


981 


4527SI 


RTA00002692F.b.l6.2.P.Seq 


F 


M00042966B:F07 


CHI SCON 


982 


415294 


RTA00002686F.f.14.1.P.Seq


F 


M00040173D:B05 


CH13EDT


983 


401644 


RTA00002685F.n.16.1.P.Seq


F 


M00039608D:H01 


CH12EDT 


984 


404402 


RTA00002687F.a.19.2.P.Seq


F 


M00039761D:E10


CH14EDT 


985 


401709 


RTA00002685F.n.24.2.P.Seq


F 


M00039624A:H09 


CH12EDT


986 


401644 


RTA00002685F.n.16.2.P.Seq


F 


M00039608D:H01


CH12EDT 


987 


452531 


RTA00002692FT. 16.2. P.Seq 


F 


M00043I25.A:31 1 


CH1SCON - j 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


988


400910


RTA00002685F.b.07.2.P.Seq 


F 


M00039367B:H02 


CH12EDT 


989


449235 


RTA00002690F.a.22.3.P.Seq


F 


M00042439B:B03 


CH16COP


990 


449794 


RTA00002691F.c.22.2.P.Seq


F 


M00043361B:A01


CH17COHLV


991 


400921


RTA00002685F.b.18.1.P.Seq


F 


M00039371B:H06


CH12EDT 


992 


373874 


RTA00002672F.c.22.2.P.Seq


F 


M00038663D:H10 


CH09LNL 


993 


401050 


RTA00002685F.e.09.2.P.Seq 


F 


M00039499C:A04 


CH12EDT 


994


hjj^j / 


RTA00002693F.c.02.2.P.Seq 


F 


M00043108A:F06 


CH19COP 




449294 


RTA00002690F.c.13.3.P.Seq


F 


M00042770C:C04 


CH16COP 




dft.il7*fi 


RTA00002687F.c.11.1.P.Seq


F 


M00039942D:C01


CH14EDT 


997




RTA00002680F.g.17.2.P.Seq


F 


M00039799A:D10 


CH09LNL 


""0 


HUH / _0 


RTA00002688F.a.18.2.P.Seq


F 


M00040371C:H05


CH14EDT 




4J 1 J 4 * / 


RTA00002691F.b.11.3.P.Seq


F 


M00043311C:E03


CH17COHLV 


1000


*in 1 1 j. 

HU I 1 Jf 


RTA00002685F.e.06.2.P.Seq


F 


M00039497C:C06 


CH12EDT 


1001


4U 1 Q / U 


RTA0000" , 686F b ~>~> 1 P Sea 


F 


M00040131C:F03 


CH13EDT 


1002


/inn i *7n 


RTAO0OO"'685F b 03 " 1 P Sea 


F 


M00039366C:B07 


CH12EDT 






RTA00002711F.f.19.1.P.Seq


F 


M00023001C:C08


CH03MAH 


1004


j / /Uoj 


RTA00002678F.n.14.1.P.Seq


F 


M00039619B:D02 


CH09LNL 


1005


403530 


RTAOOfiO-'fiSSF a 09 "* P Sea 


F 


M0004036SA:FOI 


CH14EDT 


1006


J V JU 


RTA0finfP670F i P 1 P Sea 


F 


M0003343"C:A07 


CH09LNL 




,IA 1 1 "7ft 


rt AOfino~ ! fS8^ F c ~*3 "* P Sea 


F 


M00039379A.B03 


CH12EDT 




.1A11Q7 


RTAnnfifPftS7F h 0*> *> P Sea 


F 


M00040219BD02 


CH14EDT 


1009


44V J J / 


RTAfifinn"*fiQOF c 1 8 j P Sea 


F 


M00042774C:C05 


CHI6COP 


1010


4UJ DO I 


RT Afin00"'6SSF d 06 P Sea 


F 


M000403S~C:E07 


CH14EDT 


1011


I J4 1 o~ 


RT40000^69' ! F dl^P Sea 


F 


M00043011A:H12


CH 1 SCON 






RTA00002678F.n.14.1.P.Seq


F 


M00039619B:D02 


CH09LNL 


1013


»7£ 1 ^9 
J tO 1 JO 


RTA00002674F.m.05.1.P.Seq


F 


M00039169A:E12 


CH09LNL 


1014




RTA00002685F.e.06.1.P.Seq


F


M00039497C:C06


CH12EDT 






RTA00002691F.b.14.3.P.Seq


F 


M00043320B:A07 


CH17COHLV 


1016


4UJ070 


RTA00002687F.a.04.2.P.Seq 


F 


M00039746C:H05 


CH14EDT


1017


J / / DJ. 


RTA00OO2683F.I. IS. 2. P.Seq 


F 


M000400S"D:F08 


CH09LNL 


1018




RTA00002691F.f.10.2.P.Seq


F 


M00043410C:A09 


CH17COHLV 


1019




RTA00002691F.e.10.2.P.Seq


F 


M0004339; A:C10 


CH17COHLV 


1020 


402962 


RTA00002686F.d.22.1.P.Seq


F


M00040147D:H11


CH13EDT 


1021 


427674 


RTA00002665F.i.10.1.P.Seq


F 


M00028775D:F03 


CH08LNH 


1022 


403252 


RTA00002688F.c.15.2.P.Seq


F


M00040383D:C04


CH14EDT


1023 


452038 


RTA00002692F.a.09.1.P.Seq


F


M00042623D:D07


CH18CON


1024 


401553 


RTA00002685F.d.08.2.P.Seq


F


M00039482B:G02


CH12EDT 


1025 


451092 


RTA00002691F.d.17.3.P.Seq


F 


M0004337~A:C03 


CH17COHLV


1026 


403978 


RTA00002687F.g.09.2.P.Seq


F 


M0004020S8:A07 


CH14EDT


1027 


377186 


RTA00002682F.m.07.1.P.Seq


F 


M00040014D:F03 


CH09LNL 


1028


404679


RTA00002687F.f.07.2.P.Seq


F


M00040203A:H06


CH14EDT 


1029 


373875 


RTA00002674F.c.05.1.P.Seq


F 


M0003912JC:H02 


CH09LNL 


1030 


128841 


RTA00002685F.o.15.2.P.Seq


F 


M00039630C:H04 


CH12EDT 


1031


33971


RTA00002713F.h.13.1.P.Seq


F 


M0002739:3:H02 


CH04MAL 


1032 


33287S 


RTA00002666F.h.13.1.P.Seq


F 


M000325^"C:BOI 


CH08LNH


1033 


400781 


RTA00002685F.j.03.2.P.Seq


F


M00039562B:G02


CH12EDT


1034 


456456 


RTA00002694F.b.22.1.P.Seq


F


M00043440A:E12


CH20COHLV 



10 



WO 01/02568 



PCT/US00/18374 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY


1035 


402337 


RTA00002686F.l.07.1.P.Seq


F


M00040257D:H10


CH13EDT


1036 


401974 


RTA00002686F.i.12.1.P.Seq


F


M00040223A:C05


CH13EDT


1037 


455141 


RTA00002694F.b.14.1.P.Seq


F 


M00043440C:B07 


CH20COHLV


1038 


402057


RTA00002686F.l.14.1.P.Seq


F


M00040260C:D04


CH13EDT


1039 


402555 


RTA00002686F.m.14.1.P.Seq


F 


M0004026.C:C04 


CH13EDT


1040 


406092 


RTA00002685F.k.11.1.P.Seq


F


M00039584C:C11


CH12EDT


1041 


374351 


RTA00002674F.i.20.1.P.Seq


F


M00039147A:F10


CH09LNL


1042 


402365 


RTA00002686F.j.08.1.P.Seq


F


M00040230A:H02


CH13EDT


1043 


401828 


RTA00002686F.j.14.1.P.Seq


F


M00040232D:B07


CH13EDT


1044 


447669 


RTA00002689F.a.15.2.P.Seq


F 


M000425." iB:E06 


CH15CON


1045 


402588 


RTA00002686F.k.18.1.P.Seq


F


M00040254B:C10


CH13EDT


1046 


244858 


RTA00002686F.l.02.1.P.Seq


F 


M00040256A:A06 


CH13EDT


1047 


402339 


RTA00002686F.i.20.1.P.Seq


F


M00040226A:H10


CH13EDT


1048 


401766 


RTA00002686F.o.16.1.P.Seq


F 


M00040282A:A03 


CH13EDT


1049 


402952 


RTA00002686F.g.14.1.P.Seq


F


M00040181D:H10


CH13EDT


1050 


449669 


RTA00002690F.c.10.3.P.Seq


F


M00042767B:G10


CH16COP


1051 


400520 


RTA00002685F.g.04.2.P.Seq


F


M00039512C:D06


CH12EDT


1052 


403868 


RTA000O2687F.k.O;. 1 .P.Seq 


F 


M00040318C:H11


CH14EDT


1053 


403242 


RT A000026S7F. 1.0?. 1. P.Seq 


F 


M00040323B:C12 


CH14EDT


1054 


402182 


RTA00002686F.f.16.1.P.Seq


F


M00040174C:E10


CH13EDT 


1055 


449269 


RTA00002690F.c.12.3.P.Seq


F 


M00042770B:B12 


CH16COP 


1056 


401290 


RTA00002685F.n.10.1.P.Seq


F


M00039606B:D08


CH12EDT 


1057 


448420 


RTA00002690F.d.07.3.P.Seq


F


M00042790C:C07


CH16COP


1058 


374351 


RTA00002674F.i.20.2.P.Seq


F


M00039147A:F10


CH09LNL


1059 


448464 


RTA00002690F.c.08.3.P.Seq


F


M00042765C:D04


CH16COP


1060 


401079 


RTA0OOO2685F.p.0:>.2.P.Seq 


F 


M00039643C:B04 


CH12EDT


1061 


403916 


RTA00002687F.j.11.1.P.Seq


F


M00040314D:H05


CH14EDT


1062


401374


RTA00002685F.p.07.2.P.Seq


F


M00039645C:E01


CH12EDT


1063


400503


RTA00002685F.k.02.1.P.Seq


F


M00039570B:D10


CH12EDT


1064


219825


RTA00002664F.h.06.2.P.Seq


F


M00027396D:G08


CH04MAL


1065


377732


RTA00002681F.p.09.2.P.Seq


F


M00039910C:G10


CH09LNL


1066


380348


RTA00002684F.d.12.1.P.Seq


F


M00040121B:C05


CH09LNL


1067 


449549 


RTA00002690F.a.09.3.P.Seq


F


M00042431C:F01


CH16COP


1068 


402223


RTA00002686F.f.05.1.P.Seq


F 


M00040 l69B:rUo 




1069 


401727 


RTA00002685F.o.23.2.P.Seq


F 


M00039642D:H09 


CH12EDT 


1070 


j /Vo /o 


RTA00002682F.h.12.1.P.Seq


F


M00039984A:C02


CH09LNL 


1071 


378602 


RTA00002681F.a.08.2.P.Seq


F


M00039839C:E05


CH09LNL


1072 


448065 


RTA00002690F.c.22.3.P.Seq


F


M00042731A:A07


CH16COP 


1073 


403493 


RTA00002687F.j.03.1.P.Seq


F


M00040315D:E04


CH14EDT 


1074 


400517 


RTA00002687F.k.15.1.P.Seq


F


M00040320D:F02


CH14EDT 


1075 


456636 


RTA00002694F.e.05.1.P.Seq


F 


M00043632D:F09 


CH20COHLV 


1076 


400101 


RTA00002685F.o.04.1.P.Seq


F


M00039625B:G08


CH12EDT 


1077


403578


RTA00002687F.i.01.1.P.Seq


F


M00040296D:E09


CH14EDT


1078


402419


RTA00002686F.g.20.1.P.Seq


F


M00040184C:A11


CH13EDT


1079


375161


RTA00002676F.n.01.2.P.Seq


F


M00039310B:H12


CH09LNL


1080


401851


RTA00002686F.d.07.1.P.Seq


F


M00040143A:H05


CH13EDT


1081


400567


RTA00002685F.a.14.2.P.Seq


F


M00039361B:E01


CH12EDT






SEQ 
ID 


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID 


LIBRARY 


1082 


376641 


RTA000026/7F.d.U 1 J.P.beq 


F


iVIOnO 10^4 VD09 


CH09LNL 


1083 


376641


RTA00002677F.c.24._.P.Seq


F




CH09LNL 


1084 


400450 


RTA00002685F.j.22.1.P.Seq


F




CH12EDT


1085


375373


RTA00002676F.h.12.1.P.Seq


F




CH09LNL 


1086 


375373 


RTA00002676F.h.12.2.P.Seq


F




CH09LNL 


1087 


413643 


RTA00002685F.n.05.2.P.Seq 


F




CH12EDT 


1088 


448874 


RTA00002690F.c.02.3.P.Seq


F




CH16COP 


1089 


376511


RTA00002674F.h.04.1.P.Seq


F


Norton *q i j.n a ■ RflS 


CH09LNL 


1090 


374040 


RTA00002674F.h.21.1.P.Seq


F




CH09LNL 


1091 


454132 


RTA00002693F.e.18.1.P.Seq


F




CH19COP 


1092 


404581 


RTA00002687F.g.11.1.P.Seq


F


\a (\ n n j. cn n r n - n o 9 


CH14EDT 


1093 


260521 


RTA00002689F.c.13.1.P.Seq


F




CH15CON 


1094 


379564 


RTA00002687F.o.12.1.P.Seq


F




CH14EDT 


1095 


452491 


RTA00002692F.f.05.2.P.Seq 


F


IVIUUU** JUtO t-/. D 1 I 


CH18CON


1096 


403541 


RTA00002687F.p.20.2.P.Seq


F


IVlUlJlWUJO-.^.Cw. 1 


CH14EDT


1097 


404636 


RTA00002688F.b.11.2.P.Seq


F 




CH14EDT


1098 


379564 


RTA00002687F.o.12.2.P.Seq


F 




CH14EDT


1099 


451548 


RTA00002691F.b.O c >.3.P.Seq 


F 


MO 004 j j 1 .uuO 




1100


454308 


RTA00002693F.I". 14.1. P.Seq 


F 


IV10UU4 J— 1. 3 Dl- 


CH19COP 


1101


401184


RTA00002685F.d.04.2.P.Seq 


F 


ivIOOOjVjo 1 -^ .\_uv 


CH12EDT


1102


401290


RTA00002685F.n.10.2.P.Seq


F


M00039606B:D08


CH12EDT 


1103


400101


RTA00002685F.o.04.2.P.Seq


F


M00039625B:G08


CH12EDT


1104


454308


RTA00002693F.r.14.2.P.Seq


F 


IV1U004j_ l.'D.Dl- 


CH19COP


1105


452622


RTA00002692F.b.14.1.P.Seq


F 


N,!nr\*"\ COA" Pi-^n^ 


CH18CON


1106


450012 


RTA00002691F.d.09.3.P.Seq 


F 


MUUU4 j J i *J O .\-KJo 


CH17COHLV 


1107


400503 


RTA00002685F.k.02.2.P.Seq 


F




CH12EDT 


1108 


400450 


RTA00002685F.j.22.2.P.Seq


F


vini^n to a ■ n I o 


CH12EDT


1109


446166 


RTA000026S9F.C. 1 ~. 1 P.Seq 


F


vmnnj* , 7 1 ' A I 1 

iVlUUV-t— / I . 1 1 


CH15CON


1110


456233


RTA00002694F.e.08.1.P.Seq


F


vinflfu.^ ^ Cfifi 


CH20COHLV 


1111


25443


RTA00002710F.d.15.1.P.Seq


F


VinflfP 1 A(Ti 


CH03MAH 


1112


404119


RTA00002688F.d.17.2.P.Seq


F


vinnrun^O"* r* ■ R 1 7 

1VIU UU-+U J ^- • D I — 


CH14EDT 


1113 


403642 


RTA00002687F.d.01.1.P.Seq


F




CH14EDT


1114


403493 


RTA00002687F.j.03.2.P.Seq 


F




CH14EDT 


1115


454132 


RTA00002693F.e.18.2.P.Seq


F


i^innnj.^ i Q i a ■ A07 


CH19COP


1116


450607 


RTA0000-69 1 F.d. 1 -. j. P.Seq 


F


vinnni'i ^t^cgos 

I»lUUu"tJ J ' — v- . VJv^ 


CH17COHLV


1117 


451718 


RTA00002692F.e.-4.J.h'.beq 


F


M00f)4 i04-3' AP 


CH18CON


1118


453907 


RTA0000269.>F.b.OS._.r beq 


F


V10D04 i0S" 3 G07 


CH19COP 


1119


447669


RTA00002689F.a.15.3.P.Seq


F


MOOlU" 1 -^ 'S3 E06 


CH15CON 


1120


404044 


RTA00002687F.p.11.1.P.Seq


F 


M0004035iD:AI 1 


CH14EDT


1121 


449617 


RTA00002690F.e.16.2.P.Seq


F 


M00042S4?D:F1 1 


CH16COP 


1122


452723


RTA00002692F.e.18.2.P.Seq


F 


M00043036C:E05 


CH18CON


1123 


270014 


RTA00002685F.i.15.2.P.Seq


F 


M0003953cC:HI 1 


CH12EDT


1124


401198


RTA00002685F.i.14.2.P.Seq


F


M00039536C:C10


CH12EDT


1125


452414


RTA00002692F.e.12.1.P.Seq


F


M00043032C:A10


CH18CON


1126 


453019 


RTA00002692F.d.18.2.P.Seq


F


M00043015A:H10


CH18CON


1127 


403642 


RTA00002687F.c.24.1.P.Seq


F 


M0003994fC:F09 


CH14EDT 


1128 


401437 


RTA00002685F.c.18.2.P.Seq


F 


M00039377D:E12 


CH12EDT 






SEQ
ID

1129
1130
1131
1132
1133
1134
1135
1136


CLUSTER
452414
404122
400567
401437
404642
376007
402835
403774


SEQ NAME
RTA00002692F.e.12.2.P.Seq
RTA00002687F.n.10.1.P.Seq
RTA00002685F.a.14.1.P.Seq
RTA00002685F.c.18.1.P.Seq
RTA00002687F.f.02.1.P.Seq
RTA00002676F.f.22.2.P.Seq
RTA00002686F.b.24.1.P.Seq
RTA00002687F.d.08.1.P.Seq


ORIENTATION
F
F
F
F
F
F
F
F

F


CLONE ID
M00043032C:A10

M00040334D:B02

M00039361B:E01

M00039377D:E12

M00040201C:G11

M00039293B:C11

M00040131D:G08

M00039947C:G03

M00023377B:F01


LIBRARY
CH18CON
CH14EDT
CH12EDT
CH12EDT
CH14EDT
CH09LNL
CH13EDT
CH14EDT
CH04MAL




1137 

1138

1139 

1140 

1141 

1142 

1143 

1144 

1145 

1146


45505 

449832

379004

455211

379021
376279
374373
97668
400407


RTA00002712F.d.04.1.P.Seq
RTA00002692F.c.05.2.P.Seq
RTA00002691F.e.13.1.P.Seq
RTA00002683F.n.09.2.P.Seq
RTA00002694F.b.07.1.P.Seq
RTA00002683F.n.13.2.P.Seq
RTA00002680F.d.10.2.P.Seq
RTA00002681F.n.21.1.P.Seq
RTA00002686F.d.19.1.P.Seq
RTA00002635F.a.05.2.P.Seq


F 
F 
F 
F 

F

F 

F 

F 

F 


M00042979B:E02 

M00043393A:B08 

M00040093B:C02 

M00043430B:C02 

M00040093D:D03 

M00039785D:G05

M00039903A:H07 

M00040145D:D03 

M00039134A:D03 


CH18CON
CH17COHLV 

CH09LNL 
CH20COHLV 
CH09LNL 
CH09LNL 
CH09LNL 
CH13EDT
CH12EDT
CH13EDT




1147
1148
1149
1150
1151
1152
1153
1154
1155


40291)4 ] 

403912

400511

402746 

403849 

401471 

404362 

373641 

401952 


RTA00002686F.n.15.1.P.Seq
RTA00002687F.j.19.1.P.Seq
RTA00002685F.b.23.2.P.Seq
RTA00002686F.a.14.1.P.Seq
RTA00002687F.n.09.2.P.Seq
RTA00002685F.o.10.1.P.Seq
RTA00002687F.o.06.2.P.Seq
RTA00002677F.i.09.2.P.Seq
RTA00002686F.j.10.1.P.Seq


F 
F 
F 

F

F

F

F

F

F 


M000402"4A:H1 1 

M00040317A:H03 

M00039372C:D12

M00039740B:F10

M00040333D.GOD 

M00039629B:F01 

M00040342B:D12

M00039403A:G12

M00040231B:C08 

M00039f-17'D:F04 


CH14EDT
CH12EDT
CH13EDT
CH14EDT
CH12EDT
CH14EDT
CH09LNL
CH13EDT
CH12EDT




1156 
1157 
1158
1159
1160
1161
1162
1163
1164
1165


400685 
402689 
380462 
400078 
373748 
401392 
20548 
376279 
374428 
374428 


RTA00002685F.m.09.2.P.Seq
RTA00002686F.n.0?.1.P.Seq
RTA00002670F.o.01.2.P.Seq
RTA00002685F.m.13.2.P.Seq
RTA00002671F.1.06.3.P.Seq
RTA00002685F.f.08.2.P.Seq
RTA00002710F.h.15.1.P.Seq
RTA00002680F.d.10.1.P.Seq
RTA00002672F.a.20.1.P.Seq
RTA00002672F.a.20.2.P.Seq


F 
F 

F

F 

F 

F 

F 

F

F
F 
F 


M00040271B:E12 

Mnnn3 3^"0BE06 

M00039600A:A11 

M00033325D:F12 

M00039505C:E03

M00022247A:E02

M00039785D:G05

M00038633B:G02

M00038633B:G02

M00039696A:E0> 


CH13EDT
CH09LNL
CH12EDT
CH09LNL
CH12EDT
CH03MAH 
CH09LNL 
CH09LNL 
CH09LNL 
CH09LNL 




1166
1167
1168
1169
1170


372914
378320
235422
402473
374828


RTA00002679F.j.21.1.P.Seq
RTA00002681F.l.14.2.P.Seq
RTA00002665F.h.19.1.P.Seq
RTA00002686F.p.11.1.P.Seq
RTA00002674F.m.10.1.P.Seq


F

F
F
F

F


M00039894C:H07 
!unnrPS"6SC'D05 
M000402S"C:B09 
M00039170A:B10 
M00040317A:H03


CH09LNL
CH08LNH
CH13EDT
CH09LNL
CH14EDT




1171
1172
1173
1174
1175


403912

401471

404362

403849

395617


RTA00002687F.j.19.2.P.Seq
RTA00002685F.o.10.2.P.Seq
RTA00002687F.o.06.1.P.Seq
RTA00002687F.n.09.1.P.Seq
1 RT\00OO:687F.b.l:>.l-P.Se 


F
F
F
F


M00039629B:F01
M00040342B:D12 
MOOO4O333D:G0b 
MOOO3976"B:A04 


CH12EDT
CH14EDT
CH14EDT
CH14EDT






SEQ 

ID 

1176
1177 



CLUSTER 
401709



404464 



SEQ NAME 
RTA00002685F.o.01.2.P.Seq



ORIENTATION 
F 



RTA00002687F.o.22.1.P.Seq



RTA00002689F.e.06.3.P.Seq



CLONE ID 
M00039624A:H09 



M00040347D:F09 



M00042S95C:G01 



LIBRARY 
CH12EDT 



CH14EDT 



CH15CON



1178 
1179 



18139 



RTA00002708F.r.l0.1.PSeq 



M00004139B:B10



403898 
453512 



RTA00002687F.a.05.1.P.Seq
RTA00002693F.a.21.2.P.Seq



F
F 



M00039746C:H06 
M00043078D:D04 



CH14EDT 
CH19COP 
CH14EDT 



1182 



1183 



1184 



1185 



404172 



RTA00002687F.d.17.1.P.Seq



M00039951B:B12



400973 



RTA00002685F.c.06.2.P.Seq 



M00039374C:H12



CH12EDT 



450198 



RTA00002691F.e.23.2.P.Seq



M00043405A:D11



CH17COHLV



451502 



RTA00002691F.f.03.2.P.Seq



M00043406B:G12



CH17COHLV 



RTA00002693F.f.18.2.P.Seq



F
F 



M00043220B:C04 



CH19COP 



1186
1187



1188



1189 



1190 



1191 



1192 



1193 



1194



1195 



1196



1197



1198 



1199 



1200 



1201 



1202 



1203 



1205 



1206 



1207 



1208 



1209 



1210 



1211



1212 



1213 



1214 



1215 



1216 



1217 



1218 



1219 



1220 



1221 



1222 



454414 
453752 



RTA00002693F.b.02.2.P.Seq



403700 



RTA00002687F.g.03.1.P.Seq



403371 



RTA00002687F.h.19.1.P.Seq



14583 



404161 



403274 



373465 



402582 



402241



380451 



455938 



374297 



402624 



402322 



449504 



226704 



271092 



400864 



235855 



402789 



19826 



380157 



401187 



427346 



402S66 



376712 



401655 



400147 



400864 



451600 



400147 



401655 



449307 



405121 



451718 



294345 



M00043081D:F05 



M00040207B:D08 



CH14EDT 



RTA00002687F.f.08.1.P.Seq



MOQ040294D:PI2 
M00040203B:A05 



CH14EDT 



CH14EDT 



RTA00002687F.e.20.1.P.Seq



M00039958C:B09



CH14EDT 



RTA00002687F.b.10.1.P.Seq



M00039766A:G07 



CH14EDT 



RTA00002671F.o.09.1.P.Seq



M00038615A:H1 



CH09LNL 



RTA00002686F.m.03.1.P.Seq



M00040265D:C08 



CH13EDT 



RTA00002686F.l.16.1.P.Seq



M00040261C:F01



CH13EDT 



RTA00002670F.p.12.1.P.Seq



M00033581D:D03



CH09LNL 



RTA00002694F.d.24.1.P.Seq



M00043528C:A02



CH20COHLV 



RTA00002672F.i.02.2.P.Seq



RTA00002686F.p.13.1.P.Seq



M00039013D:F02



CH09LNL 



M000402S7D:P07 



CH13EDT 



RTA00002686F.i.16.1.P.Seq



M00040233A:H02



CH13EDT 



RTA00002690F.c.11.2.P.Seq



M00042769C:E09 



CH16COP 



RTA00002664F.a.11.1.P.Seq



RTA00002690F.b.23.2.P.Seq



RTA00002685F.g.17.2.P.Seq



RTA0000266:F.o.06. 1 P.Seq 



M00023352D:H03 



CH04MAL 



M00042756D:A10



CH16COP 



M00039517B:G12
M00032876C:D06 



CH12EDT 



CH08LNH 



RTA00002686F.g.16.1.P.Seq



RTA00002710F.k.05.1.P.Seq
RTA00002682F.h.19.1.P.Seq



RTA00002685F.e.15.2.P.Seq



RTA00002665F.b.01.3.P.Seq



RTA00002686F.c.15.1.P.Seq
RTA00002677F.c.13.2.P.Seq



RTA00002685F.c.22.1.P.Seq



RTA00002685F.g.10.1.P.Seq
RTA00002685F.g.17.1.P.Seq



RTA00002691F.b.19.3.P.Seq



RTA00002685F.g.10.2.P.Seq



RTA00002685F.c.22.2.P.Seq



RTA00002690F.a.10.3.P.Seq



RTA00002633F.3.0l.2.P.Seq 



RTA00002692F.e.24.1.P.Seq



RTA000026S5F;:. 14. 1 .P.Seq 



M00040183.VFO" 



CH13EDT 



M00022467C:B12 



CH03MAH 



M00039984D:G12 



CH09LNL 



M00039500C:C04 



M00028066C:D07 



M00040138B:H03



M0003954?B:F12 



M000393"8D-.HO' r 
M00039515A:A06



M00039517B:G12



M00043328D:H02 



M00039515A:A06



M0003937SD:HO" 



M00042431D:C10



CH12EDT 



CH08LNH 



CH13EDT 



CH09LNL 



CH12EDT 



CH12EDT



CH12EDT 



CH17COHLV 



CH12EDT 



CH12EDT 
CH16COP 



M00040366A:B01 



M00043044B:A12 



M00039515D:C11



CH14EDT 



CH18CON



CH12EDT 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY 


1270 


377039 


RTA00002686F.o.12.1.P.Seq


F 


M0004028l)C:H05 


CH13EDT 


1271 


18041 


RTA00002710F.h.21.1.P.Seq


F


M00022262D:G03


CH03MAH 


1272 


401381 


RTA00002685F.o.08.1.P.Seq


F 


M00039626D:F04 


CH12EDT 


1273 


428491 


RTA00002666F.c.05.1.P.Seq


F 


M00032535D:H01 


CH08LNH 


1274 


54656 


RTA00002661F.i.22.2.P.Seq


F 


M000043 _ 2B:F07 


CH01COH 


1275 


379183 


RTA00002679F.i.17.1.P.Seq


F


M00039688C:G06


CH09LNL 


1276 


25594 


RTA00002711F.f.07.1.P.Seq


F 


M00022963B:E02 


CH03MAH 


1277 


403355 


RTA00002687F.d.11.1.P.Seq


F


M00039948D:D11


CH14EDT 


1278 


16789 


RTA00002709F.b.09.2.P.Seq 


F 


M00005382B:F08


CH02COH 


1279 


23292 


RTA00002708F.c.02.1.P.Seq


F


M00003750D:E06


CH01COH


1280


373982 


RTA00002673F.b.24.2.P.Seq 


F 


M00039058A:A04 


CH09LNL 


1281


373982


RTA00002673F.c.01.2.P.Seq


F


M00039058A:A04


CH09LNL 


1282 


449911


RTA00002691F.e.02.2.P.Seq


F


M00043384B:B02


CH17COHLV


1283 


450633 


RTA00002691F.f.02.2.P.Seq


F 


M00043405C:G12 


CH17COHLV 


1284 


23939 


RTA00002713F.j.14.1.P.Seq


F 


M00027486A:F06 


CH04MAL 


1285 


450633 


RTA00002691F.f.02.1.P.Seq


F


M00043405C:G12


CH17COHLV


1286 


379122 


RTA00002672F.n.14.1.P.Seq


F 


M0003903'5B:F09 


CH09LNL 


1287 


449429 


RTA00002690F.a.16.3.P.Seq


F 


M00042-i:"A:D04 


CH16COP


1288 


430578 


RTA00002668F.g.18.1.P.Seq


F 


M000329S-C:G05 


CH08LNH 


1289 


425824 


RTA00002687F.b.17.1.P.Seq


F 


M0003976~C:EI2 


CH14EDT 


1290 


425824 


RTA00002687F.b.17.2.P.Seq


F 


M0003976"C:E12 


CH14EDT 


1291 


401266 


RTA00002685F.i.11.2.P.Seq


F


M00039535D:D10


CH12EDT


1292 


377949 


RTA00002674F.p.04.1.P.Seq


F


M00039200A:C10


CH09LNL 


1293 


12926 


RTA00002710F.e.21.1.P.Seq


F


M00022005C:C06


CH03MAH 


1294 


378242 


RTA00002679F.c.20.2.P.Seq 


F 


M0003966-iD:G07 


CH09LNL 


1295 


401781 


RTA00002686F.e.08.1.P.Seq


F


M00040160B:A10


CH13EDT 


1296 


453101 


RTA00002693F.c.16.2.P.Seq


F 


M00043U3 3A10 


CH19COP 


1297 


377592 


RTA00002677F. 1.12. 2. P.Seq 


F 


M000394i5D:E0I 


CH09LNL 


1298 


404340 


RTA00002687F.b.05.1.P.Seq


F


M00039TO-CD07 


CH14EDT


1299 


400968 


RTA00002685F.h.01.2.P.Seq


F 


M000395I 1 D:H03 


CH12EDT


1300 


400968 


RTA00002685F.g.24.2.P.Seq


F 


M0003952;D:H03 


CH12EDT 


1301 


374417 


RTA00002671F.j.15.3.P.Seq


F 


M000383 ifC:Gl 1 


CH09LNL 


1302 


374621 


RTA00002675F.p.02.1.P.Seq


F


M00039263D:A12


CH09LNL 


1303 


19063 


RTA00002708F.i.14.1.P.Seq


F 


M0000436! A:H02 


CH01COH


1304 


135941 


RTA00002713F.g.06.1.P.Seq


F


M0002735> : >B:G05 


CH04MAL 


1305 


403355 


RTA00002687F.d.11.2.P.Seq


F


M00039948D:D11


CH14EDT 


1306 


375226 


RTA00002677F.m.08.2.P.Seq 


F 


M000394 i "C:A01 


CH09LNL 


1307 


222658 


RTA00002664F.e.14.2.P.Seq


F 


M0002710:-3:A09 


CH04MAL 


1308 


447978 


RTA00002690F.d.11.3.P.Seq


F


M00042800A:A03


CH16COP


1309 


431346 


RTA00002669F.g.24.1.P.Seq


F 


\irsf\r\~ *iKi -f~nj 
MUUUj-' - l 5A.\.U-t 




1310 


455579 


RTA00002694F.a.10.1.P.Seq


F 


M000425>- , c?3:F06 


CH20COHLV 


1311


13406 


RTA00002709F. 1.14.1. P.Seq 


F 


M0000712-D-HI0 


CH02COH 


1312 


378364 


RTA00002674F.o.17.1.P.Seq


F 


MOOOJ-^^cD A07 


CH09LNL 


1313 


373788 


RTA00002671F.c.16.2.P.Seq


F


M0003825-A:G0S 


CH09LNL


1314 


403548 


RTA00002688F.a.10.2.P.Seq


F


M00040.-o5D:E09 


CH14EDT 


1315 


22425 


RTA00002709F.c.08.2.P.Seq 


F


M000054OSa:H06 


CH02COH 


1316 


452238 


RTA00002692F.c.21.2.P.Seq


F


M00042°°SA.G04 


CH18CON






SEQ 
ID 



CLUSTER 



SEQ NAME 



ORIENTATION 



CLONE ID 



LIBRARY 



446680 



RTA00002689F.c.04.1.P.Seq



M00042693D:E04 
M00026860B:C05 



1318 



1319 



1320 



1321 



142922 



RTA00002712F.g.02.1.P.Seq



450196 



RTA00002691F.c.19.3.P.Seq



M00043359B:D10 



CH17COHLV 



26017 



RTA00002709F.d.04.1.P.Seq



M00005601D:D08 



CH02COH 



380355 



RTA00002670F.Q.06. l.P.Seq 



M00033570C:C10



CH09LNL 



1322 



1323 



25232 



RTA00002710F.n.22.1.P.Seq



M00022667D:B02 



378952 



RTA00002683F.h.11.1.P.Seq



M00040070B:B07 



CH09LNL 



404487 



RTA00002687F.c.13.2.P.Seq



M000399a3B:FI0 
M00027159D:F03 



CH14EDT 



1325 



1326 



1327 



1328 



1329 



1330 



1331 



1332 



1334 



1335 



1336 



1338 



1339 



48482 



RTA00002712F.p.06.1.P.Seq



373705 



RTA00002673F.a.13.1.P.Seq



373705 



RTA00002673F.a.13.2.P.Seq



M00039052C:F07 
M00039052C:F07 



CH09LNL 



CH09LNL 



21162 



RTA00002709F.c.03.1.P.Seq



M00005449B:D01



CH02COH 



15203 



RTA00002710F.a.21.1.P.Seq



M00007972B:H12



CH03MAH 



21162 



RTA00002709F.c.03.2.P.Seq 



M00005449B:D01 



CH02COH 



401013 



RTA00002685F.o.16.2.P.Seq



404449 



RTA00002687F.c.04.2.P.Seq 



429672 



RTA00002668F.b.10.1.P.Seq



48541 



RTA00002712F.i.07.1.P.Seq



378424 



RTA00002681F.a.03.2.P.Seq



49540 



RTA00002712F.d.24.1.P.Seq



379170 



RTA00002672F.i.21.1.P.Seq



179540 



RTA00002683F.o.20.2.P.Seq



451269 



RTA00002691F.f.11.1.P.Seq



M0003964!A:A05 



CH12EDT 



M00039770C:E04 



CH14EDT



M000329G9A:B06 



CH08LNH 



M00026922OB02 



CH04MAL 



M00039839B:B01



CH09LNL 



M00023399C:E10



CH04MAL 



M00039016D:G06 



CH09LNL 



M00040100C:E05



CH09LNL 



RTA00002691F.e.13.2.P.Seq



M0004341 S B:D08 
M00043393A:B08 



CH17COHLV 



CH17COHLV 



1341 



1342 



1343 



1344 



1345 



1346 



1347 



1348 



1349 



1350 



1351 



1352 



1353 



1354 



1355 



1356 



1357 



1358 



1359 



1360 



1361 



1362 



1363 



380119



RTA00002670F.m.20.2.P.Seq 



M00033560D:G07



153094 



RTA00002714F.a.12.1.P.Seq



M00027743A:C03



CH04MAL 



448749 



RTA00002690F.d.14.2.P.Seq



M00042806C:F07 



CH16COP 



448749 



RTA00002690F.d.14.3.P.Seq



M00042806C:F07



CH16COP 



454816 



RTA00002693F.b.16.1.P.Seq



M00043096A:G04 



CH19COP 



374744 



RTA00002670F.i.16.2.P.Seq



M00O3342"D:F0l 



CH09LNL 



404449 



RTA00002687F.c.04.1.P.Seq



58005 



RTA00002661F.h.14.1.P.Seq



451379 



RTA00002691F.b.12.3.P.Seq



4563 



RTA00002694F.d.21.1.P.Seq



455957 



RTA00002694F.c.15.1.P.Seq



428063 



374722 



RTA00002666F.l.05.1.P.Seq
RTA00002676F.j.l9.3.P.Seq 



428407 



RTA00002665F.p.12.1.P.Seq



378000 



RTA00002681F.j.16.1.P.Seq



452717 



RTA00002692F.b.17.2.P.Seq



378000 



RTA00002681F.j.16.2.P.Seq



448356 



456629 



431346 



377206 



453036 



402632 



RTA00002690F.c.03.3.P.Seq 



RTA00002694F.d.04.1.P.Seq



RTA00002669F.g.24.2.P.Seq



RTA00002682F.m.14.1.P.Seq



RTA00002692F.b.11.2.P.Seq



RTA00002686F.g.15.1.P.Seq



M00039770C:E04



CH14EDT 



M00004222C:E03 



M00043312C:E08 



M000435263-.D10 



M00043465C:A03 



M0003263SC:G08 



M00039310A:C07 



M00032510D:F12 



M000398S"D:C04 



M00042966C:E06 



M0003988"D:C04 



M00042760'A:C12 
C:F04 



M0004349 i 
M000332IS 



A:C04 
M00040b"l5C:F08 



M00042960D-.H08 



M00040182D:D06



CH01COH 



CH17COHLV



CH20COHLV 



CH20COHLV 



CH08LNH 



CH09LNL 



CH08LNH 



CH09LNL 



CH18CON



CH09LNL 



CH16COP



CH20COHLV 



CH08LNH 



CH09LNL



CH18CON
CH13EDT 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1364 


230532 


RTA00002664F.c.11.2.P.Seq


F


M00026901A:G07


CH04MAL 


1365 


30755 


RTA00002663F.e.03.1.P.Seq


F 


M000221 3SA:E05 


CH03MAH 


1366


451438


RTA00002691F.d.23.3.P.Seq


F 


M0004338:-C:F12 


CH17COHLV 


1367 


379011


RTA00002681F.n.23.1.P.Seq


F


M00039903C:D01


CH09LNL 


1368 


404048 


RTA00002687F.g.01.1.P.Seq


F 


M00040206A:A07 


CH14EDT 


1369 


404048 


RTA00002687F.g.01.2.P.Seq


F 


M00040206A:A07 


CH14EDT 


1370 


452398 


RTA00002692F.f.17.2.P.Seq


F


M00043125C:A11


CH18CON


1371 


403686 


RTA00002687F.d.03.1.P.Seq


F


M00039946B:F08


CH14EDT 


1372 


403686 


RTA00002687F.d.03.2.P.Seq 


F 


M00039946B:F08 


CH14EDT 


1373 


404048 


RTA00002687F.f.24.2.P.Seq 


F 


M00040206A:A07


CH14EDT 


1374 


404048 


RTA00002687F.f.24.1.P.Seq


F 


M00040206A:A07 


CH14EDT 


1375 


450627 


RTA00002691F.f.01.2.P.Seq


F


M00043405C:G02


CH17COHLV 


1376 


375589 


RTA00002680F.f.06.2.P.Seq


F


M00039794A:E04


CH09LNL 


1377 


379011


RTA00002681F.n.23.2.P.Seq


F 


M00039903C:D01 


CH09LNL 


1378


16789


RTA00002709F.b.09.1.P.Seq


F


M00005382B:F08


CH02COH 


1379 


427346 


RTA00002665F.a.24.3.P.Seq 


F 


M00028066C:D07


CH08LNH 


1380 


49540 


RTA00002712F.e.01.1.P.Seq


F


M00023399C:E10


CH04MAL 


1381 


14440 


RTA00002674F.e.14.2.P.Seq


F 


M00039129C:D04 


CH09LNL 


1382 


391401 


RTA00002682F.k.11.1.P.Seq


F 


M00040004D:B03 


CH09LNL 


1383 


43782 


RTA00002662F.d.21.2.P.Seq


F


M00007165B:G11


CH02COH 


1384 


212635 


RTA00002666F.p.01.1.P.Seq


F 


M000326SSD:DI t 


CH08LNH


1385 


15618 


RTA00002710F.o.05.1.P.Seq


F 


V10002268-iA:C02 


CH03MAH 


1386 


18501 


RTA00002669F.g.23.3.P.Seq 


F 


M0003321"B.H07 


CH08LNH


1387 


400310 


RTA00002688F.b.05.2.P.Seq


F 


M00040375C:B06 


CH14EDT 


1388 


403796 


RTA00002687F.h.17.1.P.Seq


F 


M00040293D:G04 


CH14EDT 


1389 


452314 


RTA00002694F.a.21.1.P.Seq


F 


M00043416C:A02 


CH20COHLV 


1390 


119179


RTA00002712F.k.20.1.P.Seq


F


M00027021A:G02


CH04MAL 


1391 


167451 


RTA00002663F.j.11.1.P.Seq


F 


M00022646A:H10 


CH03MAH 


1392 


450523 


RTA00002691F.e.19.2.P.Seq


F 


M0004340I D:G0S 


CH17COHLV 


1393 


289535 


RTA00002693F.f.06.1. P.Seq 


F 


M00043202B:F01


CH19COP 


1394 


374736 


RTA00002673F.o.08.2.P.Seq


F


M00039112B:C05


CH09LNL 


1395 


378912 


RTA00002672F.n.01.2.P.Seq


F 


M00039036C:B05 


CH09LNL 


1396 


134877 


RTA00002662F.d.05.2.P.Seq


F 


M00007026B:H09 


CH02COH 


1397 


372811


RTA00002670F.c.12.2.P.Seq


F 


M0003334 - C:F02 


CH09LNL 


1398 


373296 


RTA00002672F.e.08.2.P.Seq


F 


M00038994A:A10 


CH09LNL 


1399 


373296 


RTA00002672F.e.08.1.P.Seq


F


M00038994A:A10


CH09LNL 


1400 


452903 


RTA000O2692F.t'.0S.2.P.Seq 


F 


M00043060D:G12


CH18CON


1401 


450067 


RTA00002691F.c.17.3.P.Seq


F


M00043352D:C03


CH17COHLV 


1402 


451013 


RTA00002691F.f.08.1.P.Seq


F 


M0004340°B:B03 


CH17COHLV 


1403 


212635 


RTA00002666F.o.24.1.P.Seq


F 


\ f t\r\ A'* 1*1 o o . P\ 1 \ 

MOUUjJooaiJ.Ul L 


CH08LNH


1404 


452367 


RTA00002692F.c.02.2.P.Seq


F


M00042976A:H04


CH18CON


1405 


450627 


RTA00002691F.e.24.1.P.Seq


F


M00043405C:G02


CH17COHLV 


1406 


186438 


RTA00002713F.i.15.1.P.Seq


F 


M00027462A:D07 


CH04MAL 


1407 


431066 


RTA00002669F.c.17.3.P.Seq


F 


M00033IS C! D:F0S 


CH08LNH


1408 


378912 


RTA00002672F.m.24.2.P.Seq


F 


M00039036C:B05 


CH09LNL 


1409 


15731 


RTA00002709F.1. 1 3. 1 .P.Seq 


F 


M00007116C:G02


CH02COH 


1410 


377187 


RTA00002683F.d.21.2.P.Seq


F 


M0004004 - C:F05 


CH09LNL 







SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1411


376107


RTA00002677F.a.08.2.P.Seq


F


M00039333D:D09


CH09LNL


1412 


450580 


RTA00002691F.c.20.3.P.Seq


F 


M00043359C:G01 


CH17COHLV 


1413 


379942 


RTA00002679F.l.21.1.P.Seq


F 


M00039707A:D02 


CH09LNL 


1414 


375589 


RTA00002680F.f.06.1.P.Seq


F 


M00039794A:E04 


CH09LNL 


1415 


375789 


RTA00002674F.a.16.1.P.Seq


F


M00039120C:H03


CH09LNL 


1416 


456227 


RTA00002694F.c.16.1.P.Seq


F


M00043465C:C09


CH20COHLV 


1417 


455852 


RTA00002694F.a.02.1.P.Seq


F 


M00042592A:H10 


CH20COHLV 


1418 


25169 


RTA00002710F.m.05.1.P.Seq


F


M00022579C:C11


CH03MAH 


1419 


376524 


RTA00002678F.h.23.2.P.Seq 


F 


M00039477A:B03 


CH09LNL 


1420 


449562 


RTA00002690F.b.13.2.P.Seq


F


M00042515C:F08


CH16COP 


1421 


449562 


RTA00002690F.b.13.3.P.Seq


F


M00042515C:F08


CH16COP 


1422 


286001 


RTA00002690F.b.08.2.P.Seq 


F 


M00042511A:H04


CH16COP 


1423 


286001 


RTA00002690F.b.08.3.P.Seq 


F 


M00042511A:H04


CH16COP 


1424 


380322 


RTA00002683F.p.21.1.P.Seq


F 


M00040106B:309 


CH09LNL 


1425 


401603 


RTA00002685F.f.23.2.P.Seq 


F 


M00039510C:G02


CH12EDT 


1426 


376541 


RTA00002678F.d.13.2.P.Seq


F 


M00039456A:C08 


CH09LNL 


1427 


449123 


RTA00002690F.a.13.3.P.Seq


F


M00042435A:A11


CH16COP 


1428 


418358 


RTA00002686F.m.07.1.P.Seq


F


M00040265D:B07


CH13EDT 


1429 


380263 


RTA00002689F.a.22.1.P.Seq


F


M00042543C:G04


CH15CON


1430 


455748 


RTA00002694F.b.06.1.P.Seq


F 


M00043428D:G08 


CH20COHLV 


1431 


451679 


RTA00002693F.a.04.2.P.Seq


F


M00042612D:F06


CH19COP


1432 


396332 


RTA00002686F.k.14.1.P.Seq


F


M00040252C:C06


CH13EDT


1433 


377578 


RTA00002683F.b.11.2.P.Seq


F


M00040037A:E11


CH09LNL 


1434 


20061 


RTA00002710F.m.14.1.P.Seq


F 


M00022597D:A06 


CH03MAH 


1435 


402494 


RTA00002686F.h.16.1.P.Seq


F 


M00040191A:B09 


CH13EDT 


1436 


372798 


RTA00002670F.c.18.2.P.Seq


F 


M00033349D:F05 


CH09LNL 


1437 


236295 


RTA00002679F.a.19.2.P.Seq


F 


M00039655B:H09 


CH09LNL 


1438 


451570 


RTA00002691F.c.03.3.P.Seq


F 


M00043340B:H08


CH17COHLV 


1439 


35847 


RTA00002708F.h.03.1.P.Seq


F 


M0000423>?B:F1 1 


CH01COH


1440 


455706 


RTA00002694F.b.10.1.P.Seq


F 


M00043433B:G09 


CH20COHLV 


1441 


346310 


RTA00002684F.d.18.1.P.Seq


F 


M00040122D:A02


CH09LNL 


1442 


189561 


RTA00002676F.j.09.3.P.Seq


F 


M00039308B:G08 


CH09LNL 


1443 


403200 


RTA00002687F.j.24.1.P.Seq


F 


M00040318A:B02 


CH14EDT 


1444 


401413 


RTA00002685F.i.03.2.P.Seq 


F 


M00039530B:E02 


CH12EDT 


1445 


448680 


RTA00002690F.b.02.3.P.Seq 


F 


M00042440B:E09 


CH16COP


1446 


117060 


RTA00002679F.h.24.1.P.Seq


F 


M00039686C:C05 


CH09LNL 


1447 


403200 


RTA00002687F.j.24.2.P.Seq 


F 


M00040318A:B02


CH14EDT 


1448 


448589 


RTA00002690F.3.07.3. P.Seq 


F 


M00042349D:D07 


CH16COP


1449 


373806 


RTA00002674F.o.02.1.P.Seq


F 


M00039179A:G09


CH09LNL 


1450 


377055 


RTA00002682F.k.13.1.P.Seq


F


M00040005B:C11


CH09LNL 


1451 


373111


RTA00002670F.n.14.2.P.Seq


F 


M00033566C:E08 


CH09LNL 


1452 


12350 


RTA000027 1 3F.3.05. 1 .P.Seq 


F 


M00027195C:E04


CH04MAL 


1453 


450366 


RTA00002691F.c.06.3.P.Seq


F 


M00043344D:E04 


CH17COHLV 


1454 


397851 


RTA00002680F.b.04.2.P.Seq 


F 


M00039775A:A09 


CH09LNL


1455 


403200 


RTA00002687F.k.01.2.P.Seq


F 


M00040318A:B02 


CH14EDT 


1456 


403200 


RTA00002687F.k.01.1.P.Seq


F 


M00040318A:B02


CH14EDT 


1457 


401142


RTA00002687F.i.24.2.P.Seq


F 


M00040313C:D0f 


CH14EDT




WO 01/02568 



PCT/US00/18374 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


1458 


375221 


RTA00002679F.k.19.1.P.Seq


F 


M00039702A:B02 


CH09LNL 


1459 


403471 


RTA00002687F.a. U.l.p.Seq 


F 


M00039~49D:D05 


CH14EDT


1460 


12270 


RTA00002711F.f.23.1.P.Seq


F 


M00023007C:E10 


CH03MAH 


1461 


401013 


RTA00002685F.o.16.1.P.Seq


F 


M00039641A:A05


CH12EDT 


1462 


74344 


RTA00002661F.f.10.1.P.Seq


F 


M00003902A:C03 


CH01COH 


1463 


423432 


RTA00002687F.1. l0.2.P.Seq 


F 


M00040323C:G11


CH14EDT 


1464 


423432 


RTA00002687F.1. 1 0. 1 .P.Seq 


F 


M00040323C:G11


CH14EDT 


1465 


379560 


RTA00002682F.g.18.1.P.Seq


F 


M00039981A:E08


CH09LNL 


1466 


122669 


RTA00002712F.f.22.1.P.Seq


F 


M00026357D:G12


CH04MAL 


1467 


373319 


RTA00002671F.c.17.2.P.Seq


F 


M00038259B:A02 


CH09LNL 


1468 


448034 


RTA00002690F.b.16.2.P.Seq


F 


M00042751C:C12


CH16COP 


1469 


376366 


RTA00002677F.h.05.2.P.Seq 


F 


M0003939"B:H09 


CH09LNL 


1470 


452253 


RTA00002692F.f.04.2.P.Seq


F 


M00043045D:G12


CH18CON


1471 


401601 


RTA00002685F.f.18.2.P.Seq


F 


M00039508C:G01


CH12EDT


1472 


373647 


RTA00002672F.d.04.1.P.Seq


F 


M00038664C:E04 


CH09LNL 


1473 


379721


RTA00002676F.b.20.2.P.Seq 


F 


M00039276B:H09 


CH09LNL 


1474 


446404 


RTA00002689F.e.02.3.P.Seq


F 


M00042SS7C:D07 


CH15CON 


1475 


403738 


RTA00002687F.a.10.2.P.Seq


F 


M00039-43A:FI 1 


CH14EDT


1476 


376887 


RTA00002674F.f.23.2.P.Seq


F 


M00039i;-5D:H02 


CH09LNL 


1477 


373787 


RTA00002677F.I.04.2.P.Seq 


F 


M0005^4|4D:G03 


CH09LNL 


1478 


401375 


RTA00002685F.n.04.1.P.Seq


F 


M00039604B:E05 


CH12EDT 


1479 


401375 


RTA00002685F.n.04.2.P.Seq


F 


M00039604B:E05


CH12EDT 


1480 


403232 


RTA00002687F.g.20.2.P.Seq


F 


M00040218C:C02


CH14EDT 


1481 


403232 


RTA00002687F.g.20.1.P.Seq


F 


M00040218C:C02


CH14EDT


1482 


449080 


RTA00002690F.a.04.2.P.Seq 


F 


M00042347D:H11


CH16COP 


1483 


430973 


RTA00002669F.a.03.4.P.Seq


F 


M00033 1 ~63:E12 


CH08LNH 


1484 


374742 


RTA00002676F.c.12.2.P.Seq


F 


M00039279B:C11


CH09LNL 


1485 


449741


RTA00002690F.e.23.2.P.Seq 


F 


M000428563:H02 


CH16COP 


1486 


45341 


RTA00002710F.k.19.1.P.Seq


F 


M0002249Oa:B02 


CH03MAH 


1487 


451220 


RTA00002691F.f.07.2.P.Seq


F 


M00043J03B:DI 1 


CH17COHLV 


1488 


22067 


RTA00002708F.I-'. 12.1. P.Seq 


F 


M00004 M0D:C03 


CH01COH


1489 


378952 


RTA00002683F.h.11.2.P.Seq


F 


M000400"0B:B07 


CH09LNL 


1490 


401435 


RTA00002685F.n. 14.2.P.Seq 


F 


M00059t)0"D:E08 


CH12EDT 


1491 


375284 


RTA00002676F.g.21.2.P.Seq


F 


M00039298D:B04 


CH09LNL


1492 


449080 


RTA00002690F.a.04.3.P.Seq


F 


M00042347D:H11


CH16COP 


1493 


37897 


RTA00002661F.b.15.1.P.Seq


F 


M0000I4 6B:GI0 


CH01COH


1494 


7572 


RTA00002709F.h.03.1.P.Seq


F 


M0000t>SO9B:B09 


CH02COH 


1495 


377076 


RTA00002682F.I". 14. 1 .P.Seq 


F 


M0003" l ?"D:D12 


CH09LNL 


1496 


374828 


RTA00002674F.m.10.2.P.Seq


F 


M0003°l T 0A:BI0 


CH09LNL 


1497 


400293 


K 1 AUUUUJoS^r.a. 1 '.J.r.ieq 


r 


ivl UUU_' ^ O j A . U V 


CYl 1 TPHT 
tw ri i _ t L/ 1 


1498 


401435 


RTA00002685F.n.14.1.P.Seq


F 


M0003"oO"D:E08 


CH12EDT 


1499 


3746S0 


RTA00002676F.C. U.I.P.Seq 


F 


MOOOJ^-SOBOS 


CH09LNL 


1500 


399018 


RTA00002684F.d.20.2.P.Seq


F 


mooo4oi::a.ao9 


CH09LNL 


1501 


376351 


RTA00002678F.c.19.2.P.Seq


F 


M000:- l '45;C:G09 


CH09LNL 


1502 


19699 


RTA00002710F.f.18.1.P.Seq


F 


M0002:i05C:C12 


CH03MAH 


1503 


394113


RTA00002665F.d.15.3.P.Seq


F 


M0002S3 UD:F0f 


CH08LNH 


1504 


452652 


RTA00002692F.a.16.1.P.Seq


F 


M0004;M"C:DOI 


CH18CON



1505 


450791 


RTA00002691F.b.23.3.P.Seq


F 


M00043338B:A03


CH17COHLV 


1506 


20112


RTA00002711F.b.16.1.P.Seq


F 


M00022830D:D01


CH03MAH 


1507 


455142 


RTA00002694F.b.08.1.P.Seq


F 


M00043431D:B08


CH20COHLV 


1508 


117060


RTA00002679F.i.01.1.P.Seq


F 


M00039686C:C05 


CH09LNL 


1509 


447859 


RTA00002689F.d.13.1.P.Seq


F 


M00042737C:H04 


CH15CON 


1510 


452572 


RTA00002692F.e.16.1.P.Seq


F 


M00043034D:C01 


CH18CON


1511


448639 


RTA00002690F.a.06.3.P.Seq


F 


M00042348B:E05 


CH16COP


1512 


37S947 


RTA00002683F.o.12.2.P.Seq


F 


M00040098C:B01


CH09LNL 


1513 


403599 


RTA00002687F.L 12.2.P.Seq 


F 


M00040299B:F10 
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RTA00002667F.m.2 1 . 1 .P.Seq 


F 


M00032864B:B09 


CH08LNH 


1837 


374018 


RTA00002672F.a.14.2.P.Seq


F 


M00038632C:B09


CH09LNL 


1838 


375409 


RTA00002678F.n.02.2.P.Seq 


F 


M00039616B:C01 


CH09LNL 


1839 


401155


RTA00002685F.o.12.2.P.Seq


F 


M00039630A:C08 


CH12EDT 


1840 


13958 


RTA00002711F.b.02.1.P.Seq


F 


M00022817A:H02 


CH03MAH 


1841 


38767 


RTA00002687F.a.11.1.P.Seq


F 


M00039748C:F11


CH14EDT


1842 


29398 


RTA00002663F.c.23.1.P.Seq


F 


M00022015B:B07 


CH03MAH 


1843 


12453 


RTA00002709F.c.23.2.P.Seq 


F 


M00005556B:D02 


CH02COH 


1844 


38767 


RTA00002687F.a.11.2.P.Seq


F 


M00039748C:F11


CH14EDT


1845 


279885 


RTA00002671F.f.05.2.P.Seq


F 


M00038279C:A11


CH09LNL 



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1846 


188592 


RTA00002664F.c.18.2.P.Seq


F 


M00027141C:H03 


CH04MAL 


1847 


376469 


RTA00002674F.h.06.2.P.Seq 


F 


M00039140D:A04


CH09LNL 



SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


1 


10600 


RTA00002891F.j.07.1.P.Seq


F 


M00003753B:D07


CH01COH


2


18827 


RTA00002900F.p.12.1.P.Seq


F 


M00005413D:A05


CH02COH 


3 


1759 


RTAO0OO2923F1.23. l.P.Seq 


F 


M00039248C:A08


CH09LNL 


4 


10924 


RTA00002907F.k.12.1.P.Seq


F 


M0002222-lA:C07 


CH03MAH 


5 


45331 


RTAO0OO29O3F.I. 10. 1 .P.Seq 


F 


M00007037D:D10


CH02COH 


6 


42233 


RTA00002912F.2.24. l.P.Seq 


F 


M00027359B:A06 


CH04MAL 


7 


72L1 


RTA00002909F.h.06.1.P.Seq


F


M00022634A:C07 


CH03MAH 


8


21395 


RTA00002890F.k.16.1.P.Seq


F 


M00001637D:C12


CH01COH


9 


3093 


RTA00002923F.e.03.1.P.Seq


F 


M00039225A:D11


CH09LNL 


10 


15806 


RTA00002S94F.1.07. l.P.Seq 


F 


M00003991A:C11


CH01COH


11 


19739 


RTA00002896F.d.12.1.P.Seq


F 


M00004147C:E01 


CH01COH


12 


140879 


RTA00002905F.c.17.1.P.Seq


F 


M000079S5C:D0S 


CH03MAH 


13 


29706 


RT A00002908F.I.22. 1 .P.Seq 


F 


MOOO22-l3"B:A0S 


CH03MAH


14 


109581 


RTA00002918F.i.08.1.P.Seq


F 


M00032908A:D08


CH08LNH


15 


25009 


RTA00002906F.k.11.1.P.Seq


F 


M00022016B:F01 


CH03MAH 


16 


8328 


RTA00002888F.e.07.1.P.Seq


F 


M00001451C:E10


CH01COH


17


15045 


RTA00002887F.e.06.1.P.Seq


F 


M00001393C:E08


CH01COH


18 


21216 


RTA00002898F.p.22.1.P.Seq


F 


M00004416B:G10 


CH01COH


19 


185754 


RTA000029 12F.1.09. l.P.Seq 


F 


M00027506B:G01


CH04MAL 


20 


11831 


RTA00002909F.h.10.1.P.Seq


F 


MOOO22b3SA:D03 


CH03MAH 


21 


185989 


RTA00002910F.h.12.1.P.Seq


F 


M00022924C:FO-i 


CH03MAH 


22 


9667 


RTA000O2923F.3.03. l.P.Seq 


F 


M00039162D:C04


CH09LNL 


23 


15817 


RTA00002903F.o.03.1.P.Seq


F 


M00007103D:C02 


CH02COH 


24 


10198 


RTA00002923F.j.09.1.P.Seq


F 


M00039294C:B09


CH09LNL


25 


6355 


RTA00002894F.p.12.1.P.Seq


F 


M00004055D:D05 


CH01COH


26 


12227 


RTA00002909F.e.18.1.P.Seq


F 


M00022601B:G06 


CH03MAH


27 


11047 


RTA00002893F.o.06.1.P.Seq


F 


M00003960D:C12


CH01COH


28


1870 


RTA00002910F.m.08.1.P.Seq


F 


M00023020C:H03


CH03MAH


29 


20C65 


RTA00002908F.m.09.1.P.Seq


F 


M00022-iSM.\:A0S 


CH03MAH 


30 


19454 


RTA00002900F.m.23.1.P.Seq


F 


M00005.-"9A:D10 


CH02COH 


31 


48048 


RTA00002922F.m.13.1.P.Seq


F 


M00039124D:H01


CH09LNL 


32 


19799 


RTA00002908F.h.19.1.P.Seq


F 


M000224-9D:F0S 


CH03MAH


33 


185562 


RTA00002911F.m.07.1.P.Seq


F 


M00027093A:H02


CH04MAL 


34 


24214 


RTA00002891F.k.19.1.P.Seq


F 


M00003~64D:F0" 


CH01COH


35 


5172


RTA00002908F.p.22.1.P.Seq


F 


M00022525B:D09 


CH03MAH



36 


50495 


RTA00002898F.c.16.1.P.Seq


F 


M00004321C:C11


CH01COH


37 


43287 


RTA00002908F.k.16.1.P.Seq


F 


M000224~0D:B02 


CH03MAH 


38 


15324 


RTA00002905F.p.20.1.P.Seq


F 


MOOOZltO-CiBO" 


CH03MAH 


39 


22157 


RTA000023SSF.2.07. l.P.Seq 


F 


M00001461D:B10


CH01COH


40 


15249 


RTA000029 ! 5F.1.0S. 1 .P.Seq 


F 


M000324$9B:G!2 


CH08LNH


41 


2764 


RTA00002925F.c.11.1.P.Seq


F 


M00039S29B:EOL 


CH09LNL


42 


23338 


RTA00002SS9F.b.U. l.P.Seq 


F 


M000015 ;SB:DiO 


CH01COH


43 


11074 


RTA00002899F.a.22.1.P.Seq


F


vioooo.iro T C 10 


CH01COH


44 


18367 


RTA000029:2F.b.09. 1 .P.Seq 


F 


M0003Se'.9D:CL2 


CH09LNL 


45 


21703 


RTA00002903F.m.08.1.P.Seq


F 


M0O0O"O59B:D0" 


CH02COH 


46 


21470 


RTA00002895F.c.14.1.P.Seq


F 


M000040o7B:D03 


CH01COH


47 


15492 


RTA00OO29O"F.p.0b. l.P.Seq 


F 


mooo:::s:b:C09 


CH03MAH


48 


4022 


RTA0000:S9"F.i. 22. l.P.Seq 


F 


M 00004 IcOB: BO- 


CH01COH


49 


21579 


R T AOOOO: i F.s.03. 1 .P.Seq 


F 


M00001oSbB:K01 


CH01COH


50 


IS62S3 


RT A00G029 ! }r.c.Ct>. ! .P.Seq 


F 


1 M0002"S.:.IB:DO* 


CHO-iNLAL _ 
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ILf 






ORIENTATION 


CLONE ID 


LIBRARY


51 


5410 


RTA00002Vilr.i.U8. l.r.Seq 


F


IVIUUVJ J J*+*+ J 1—/. VJV/J 


CH09LNL


52 


22420 


RTA00002901F.e.19.1.P.Seq


F






53 


140553 


RTA000029 loF.n.U2. 1. P.Seq 


F


iviuuu J— O JO O . l\J— 


CH08LNH


54 


23849 


RTA00002887F.a.22.1.P.Seq


F




CH01COH 


55 


21945 


RTA00002893F.1. 22.1. P.Seq 




Mfinnnj. i rnc* Pin 

IVlUUiJl-/** 1 UjL.. C 1U 


CH01COH 


56 


7867 


RTA00002901F.p.08.1.P.Seq


F




CH02COH


57 


14533 


RTA00002S96F.1.U 1 . 1 .r.Seq 


F


MOnflOtl 1 7Qf"-B06 


CH01COH 


58 


5790 


RTA00002919F.g.17.1.P.Seq


F


IVl wv J J vOU\. . r\U / 


CH08LNH 


59 


186153 


RTA000029I 1F.I.24. 1. P.Seq 


F


MnfHY>70 1 7 A • R 09 


CH04MAL 


60 


10561 


RTA00002899F.n.08.1.P.Seq


F


MnnnfiiiisnAn- H09 


CH01COH 


61 


24572 


RTA0000289jF. 1. 08.1. P.Seq 


F


IV 1 Uuw _) V — vj ,-\ . 1 l l 


CH01COH 


62 


13138 


RTA00002888F.m.03.1.P.Seq


F


IVlUUUv l*+OOL . r\\JJ 


CH01COH 


63 


6701 


RT A00002922 F.. 2. 18.1. P.Seq 


F




CH09LNL 


64 


12751 


RTA00002904F.c.10.1.P.Seq


F


Mnnnni7''(y r • fo t 


CH02COH 


65 


3583 


RTA00002916F.n.21.1.P.Seq


F




CH08LNH 


66 


12673 


RTA00002901F.d.24.1.P.Seq


F




CH02COH


67 


15243 


RTA00002901F.I.2 1 . 1 P.Seq 


F


IVIUUUUJO — J D .uU 1 


CH02COH


68 


21022 


RTA00002922F.k.24.1.P.Seq


F


1V1UUUJV 111 * — 


CH09LNL


69 


36596 


RTA00002919F.g.24.1.P.Seq


F


N/fnnn*^ns i n-o i i 

IV1UUUJ JUO L LJ -U 1 l 


CH08LNH


70 


4932 


RTA00002890F.c.14.1.P.Seq


F


IVIUUUU iJVOrt.L'U- 


CH01COH


71 


42413 


RTA00002900F.o.14.1.P.Seq


F


lvlUUUUJ'+'U If.ru? 


CH02COH


72 


1090 


RTA00002918F.g.20.1.P.Seq


F




CH08LNH


73 


44737 


RTA00002901F.a.20.1.P.Seq


F


Mnnnn^ a rn^ 

1V1UuUU_/'t J*+.-\ . v_ w-> 


CH02COH


74 


4183 


RTA00002918F.n.23.1.P.Seq


F


1VIUUUJ-700D .VJVl 


CH08LNH


75 


41882 


RTA00002902F.d.12.1.P.Seq


F


1V1 UwwUJ O \J *~s ■ IS 


CH02COH 


76 


500 


RTA0000292DF.O. la. 1 P.Seq 


F


M0OO400 i4A'F06 


CH09LNL 


77 


5435 


RTA0000292 lF.t\20. 1. P.Seq 


F




CH09LNL 


78 


15829 


RTA00002900F.|.01. 1. P.Seq 






CH02COH 


79 


154083 


RTA0000290 ,■ F.a.06. 1 .P.Seq 


F




CH03MAH 


80 


24381 


RTA00002910F.1. 16. 1. P.Seq 


F


M noo^ 7 Q > l R • D06 


CH03MAH 


81 


107940 


RTA00002930F.f.07.1.P.Seq


F


K/ronn^ S7 ^ s a - hos 


CH15CON


82


24761 


RTA00002902F.I.2 1 . 1 .P.Seq 


F




CH02COH 


83 


10734 


RTA00002924F.e.02.1.P.Seq


F




CH09LNL


84 


40540 


RTA00002897F.p.2j. 1 .P.Seq 


F


m nooo4 "tfn c* • c 05 


CH01COH 


85 


23692 


RTA00002930F.i.07.1.P.Seq


F


^1000^60570^06 


CH15CON 


86 


7896 


RTA00002906F.|.OS. 1. P.Seq 


F


MOOfP IQOSR-D09 


CH03MAH 


87 


24387 


RTA00002a9or.e.uy. l.r.Seq 


F


M00004 1 5 1 B ■ \07 


CH01COH 


88 


2420 


RTA00002889F.n.02.1.P.Seq






CH01COH 


89 


10431 


RTAOUUU^ao /r.p.U/. l.r.oeq 


F


M (TOO0 1 4"' 9 B • G 05 


CH01COH 


90 


14665 


RT A0000290o r . 2 .U / . 1 . r - oeq 


F


Nif finfl'' ''4'' T A C 09 


CH03MAH 


91 


10302 


RTA00002906F.o.03.1.P.Seq


F 


M000220S1A:B07 


CH03MAH 


92 


28436 


RTA00002908F.f.08.1.P.Seq


F 


M00022415C:D12 


CH03MAH 


93 


17829 


RTA00002SS9F.?i. 1 1 - 1. P.Seq 


F 


M00001544B:E06 


CH01COH


94 


10390 


RTA00002906F.e.13.1.P.Seq


F 


M00021923A:B12


CH03MAH 


95 


11619 


RTA00002913F.c.07.1.P.Seq


F 


M00027S06C:H05 


CH04MAL 


96 


6890 


RTA00002918F.m.19.1.P.Seq


F 


M00032979D:H07 


CH08LNH


97 


10110 


RTA00002897F.c.13.1.P.Seq


F 


M00004:45C:G10 


CH01COH


98 


21511


RTA00002892F.h.24.2.P.Seq


F 


M00003821C:E12


CH01COH 


99 


9237


RTA00002899F.h.14.1.P.Seq


F 


M00004608A:C10


CH01COH


100 


16575 


RTA00002895F.n.02.1.P.Seq


F 


M00004110D:F09


CH01COH
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SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


101


1UOJ / 


RTAf)nO(PS99F h 17 1 P Sea 


F 


M00004609A:E09 


CH01COH 


102




RTAOO0O' , S9lF d 14 1 P Sea 


F 


M00003787D:A10


CH01COH


103




RTAfinOfPSQfiF f"U I P Sea 


F 


M00004153B:E03 


CH01COH 


104


sn70 

ou/u 


RT A0f)fi(V> Q '' ,, SF c 09 1 P Sea 

IV 1 ."\UUUU — 7 — .J I . V 7 . i ■ A • iJfcV^ 


F 


M00039828B:H06 


CH09LNL 


105


1J0OU 


RT Afinnn7SS7F o ll l P Sea 


F 


M00001397C:H08 


CH01COH


106




RTAnnnfPOOl F a 10 1 P Sea 


F 


M00005500A:D04 


CH02COH 


107


QQAI 


rt Aonnn^oo^F 1 0 1 1 P Sea 


F 


M00007046D:C09


CH02COH 


108


77DO 


RTAnnnn")Q 1 1 F o ''O 1 P Sea 


F 


M00027168B:H08


CH04MAL 


109


1 / J u 


RTAflfiOfPQOfiF d 03 1 P Sea 


F 


M00021896D:A05 


CH03MAH


110


ZH0 J J 


RTAf)nO<Y> Q '74F f 18 IP Sea 


F 


M00039554D:B09 


CH09LNL 


111


i sonn 


RT AOflOfPS 0 1 F i 08 1 P Sea 
Ix i ^xuuuu — o7iri|iUUii<i • 


F 


M00003758B:F06 


CH01COH


112




RTAflnn0'>9ft'SF e ">3 1 P Sea 


F 


M00008020D:D05 


CH03MAH 


113


j / SO 


RTAnonfPQOI F e OS I P Sea 


F 


M00005466C:B01


CH02COH 


114


1 SA 10 1 
1 J 4 * I — 1 


RT AnnnO^OnfiF n O'' 1 PSea 


F 


M00022088B:F10 


CH03MAH 


115


<7A A 

J /40 


RTAnnon^Q i rf 1 no i p Sen 


F 


M00032945D:B07 


CH08LNH 


116


JJ/UU 


DTAnnfW"PQfl 1 F p- DO 1 P Sen 


F


M00005468A:C04 


CH02COH 


117


JOOU 


K IrtUUUU— oyur.D. I i . 1 .r - oCL( 


F


M00001591B:H05


CH01COH


118


ZZ / JZ 


Ix 1 rtwyu-7i-tr 1 U — ? - 1 .r . jcl| 


F


M00039785C:H12 


CH09LNL 


119


14/ZU 


dx a nnnn^s qop i ns 1 p ^ph 
ix 1 .AUUuy»o7ir.].yj. 1 .r -ocl| 


F


M000038''5AH10 


CH01COH


120


1 JO JO 


dt Annnn^^QftP r Od. 1 P v>n 

ix 1 rtwUU-u7ur,t 1 .r. jcl^ 


F


M00004143B:B04


CH01COH


121


Zj I JU 


dt Annnn">^R7P n ni 1 p N>n 

tx 1 nUUUU- 00 / r.U'\J 1 • 1 .r .jcl| 


F


M00001384A:A07


CH01COH


122


1 I OTA 

1 Iv/U 




F


MQ000700''C • A 1 0 


CH02COH 


123


lUOoO 


DTAnnnn^o i sp r» n? P 9H»n 


F


M0003' > 5 19D-F08 


CH08LNH 


124


QCQQ 


dt a nnnn^Q^ ip m in 1 p 


F


M00039331B:F09


CH09LNL 


125


ojUU 


DTAnnnn^Q^SF *> 1 1 P 


F


M00039860D-B0" 7 


CH09LNL 


126


OOl J 


pTAnnnn^on7F 1 17 1 p spn 

tx 1 aUUUU-7U / r. 1 . I / . 1 .r .JClj 


F


M0002223SC:G04 


CH03MAH


127


/ jZ4 


dt Annnn^s^^F p \ fi 1 p ^pn 

ix l nUUw-OOOr.c. 1 vj- i .r .jclj 


F


M00001348B:B03 


CH01COH


128


lOS 
jZj 


rt Annnn^o i ^f ct n*^ 1 p Sen 


F 


M00027332B:H09


CH04MAL


129




DTAHOnn^^^OF nfll 1 P Sen 


F 


M00001570A:B07


CH01COH


130


O "1x7,1 
Z JJ J*+ 


RTAnnnn^ssoF c ^ 1 p Sea 


F


M00001533D:A01 


CH01COH


131


/**/ J 


RT Annnn°soiF n ^0 1 P Sea 


F


M00003965D:D11


CH01COH


132


1 QxA7s^ 
1 o jOZj 


RT AnnOO^O I ^F f 1 0 1 P Sea 

IX 1 ."\UUuU-7 I ^ X ■ I ■ 1 V. l.t . OtL) 


F 


M00027314D:E02 


CH04MAL


133


ioon 


RTAnnnn^O 1 7F mfl7 1 P Sea 


F 


M00032773D:F08


CH08LNH


134


04j0 


RT Annnn^SSQF m 0° I PSea 


F


M00001562D:B07 


CH01COH


135


ZUZOJ 


RTAnnnn^on^F n ns i p Sen 


F 


M00022073C:C07


CH03MAH 


136


1 0014 1 


RTAnnnn°Q 1 ^F f 04 I PSea 


F 


M00027311A:H09 


CH04MAL


137


A^^l 
40JZ 


RTAnnnn^Q 1 OF i !^ 1 P Sea 
IV I rtUUUU-7 1 7r.i - 1 j. i .r - Jtij 


F 


M00033150B:E02 


CH08LNH


138


0 1/1 A 
Z 140 


RTAnf)f)n"'>3''( : iF n ~> 1 1 PSea 

IV 1 rtUOvw- 7-UI . «1. — L. • L.X . JtL| 


F 


M00040061C:C08 


CH09LNL


139


J JZZ 


RTAnnnn n S07F h ^3 I P Sea 


F 


M00004263C:D03 


CH01COH






RTAOnOO" , 909F ail I P Sea 


F 


M00022627B:H03


CH03MAH 


141 


15650 


RTA00002925F.2.04. l.P.Seq 


F 


M00039S74A:B06 


CH09LNL 


142 


21031 


RTA00002901F.h.21.1.P.Seq


F 


M00005520B:H05 


CH02COH 


143 


95610 


RTA00002909F.h.18.1.P.Seq


F 


M00022642A:G08 


CH03MAH 


144 


903 


RTA00002912F.o.03.1.P.Seq


F 


M00027591A:E04 


CH04MAL


145 


17284 


RTA00002916F.k.18.1.P.Seq


F 


M00032620B:F06 


CH08LNH


146 


15556 


RTA00002895F.m.01.1.P.Seq


F 


M00004104A:A12 


CH01COH


147 


11013 


RTA00002897F.b.08.1.P.Seq


F 


M00004214D:A05 


CH01COH


148 


15358 


RTA00002903F.n. IS. l.P.Seq 


F 


M0000709SA:EI0 


CH02COH


149 


10792 


RTA00002894F.a.10.1.P.Seq


F 


M00003974C:E11 


CH01COH


150 


25507 


RTA00002901F.n.07.1.P.Seq
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CH04MAL 


463 


4063 


RTA00002916F.j.24.1.P.Seq


F 


M00032613A:E11 


CH08LNH 


464 


6267 


RTA00002910F.d.20.1.P.Seq


F 


M00022835C:A09 


CH03MAH 


465 


21349 


RTA00002901F.c.04.1.P.Seq


F 


M00005445D:F11 


CH02COH 


466 


1123 


RTA00002894F.i.24.1.P.Seq


F 


M00004031C:G06 


CH01COH


467 


4401 


RTA00002918F.m.18.1.P.Seq


F 


M00032979D:C11


CH08LNH 


468 


15255 


RTA00002925F.p.10.1.P.Seq


F 


M00040041A:G08 


CH09LNL 


469 


10991 


RTA00002933F.a.15.1.P.Seq


F 


M00043077C:G10 


CH19COP 


470 


48768 


RTA00002886F.m.24.1.P.Seq


F 


M00001374C:B10 


CH01COH


471 


20406 


RTA00002900F.c.20.1.P.Seq


F 


M00004852D:C06 


CH02COH 


472 


39784 


RTA00002SS6F.2.05.1. P.Seq 


F 


M00001352B:B02 


CH01COH


473 


36567 


RTA00002886F.n.06.1.P.Seq 


F 


M00001375B:D04 


CH01COH


474 


14817 


RTA00002902F.a.18.1.P.Seq


F 


M00005771D:C02 


CH02COH 


475 


156277 


RTA00002907F.1. 13.2.P.Seq^ 


F 


M00022237D:D06


CH03MAH 


476 


6898 


RTA00002907F.a.22.1.P.Seq


F 


M00022104A:G08 


CH03MAH


477 


17376 


RTA00002902F.c.03.1.P.Seq


F 


M00005819D:F09 


CH02COH 


478 


186535 


RTA00002912F.d.12.1.P.Seq


F 


M00027270A:D04 


CH04MAL 


479 


91616 


RTA00002910F.b.24.1.P.Seq


F 


M00022812A:G01


CH03MAH 


480 


91616 


RTA00002910F.c.01.1.P.Seq


F 


M00022812A:G01


CH03MAH


481 


6993 


RTA00002896F.j.12.1.P.Seq


F 


M00004172D:F04 


CH01COH


482 


12443 


RTA00002916F.a.20.1.P.Seq


F 


M00032534B:E12


CH08LNH 


483 


28585 


RTA00002901F.j.16.1.P.Seq


F 


M00005570A:D05 


CH02COH 


484 


9453 


RTA00002907F.k.21.2.P.Seq


F 


M0002222SB:B11 


CH03MAH


485 


156009 


RTA00002907F.k.05.2.P.Seq


F 


M00022220A:A07 


CH03MAH


486 


5958 


RTA00002908F.n.22.2.P.Seq 


F 


M00022507C:C08 


CH03MAH


487 


155939 


RTA00002907F.j.23.2.P.Seq 


F 


M0002221SB:B12 


CH03MAH 


488 


16695 


RTA00002886F.g.22.1.P.Seq


F 


M00001353D:E05 


CH01COH


489 


10118 


RTA00002886F.h.18.1.P.Seq


F 


M00001356D:E06


CH01COH


490 


13288 


RTA00002930F.b.21.1.P.Seq


F 


M00042S91C.G08 


CH15CON 


491 


3210 


RTA00002910F.h.22.1.P.Seq


F 


M00022945A:H09


CH03MAH


492 


15014 


RTA00002934F.a.18.1.P.Seq


F 


M0004352SA:E11 


CH20COHLV 


493 


22087 


RTA00002890F.i.19.1.P.Seq


F 


M00001624A:C01 


CH01COH


494 


31948 


RTA00002908F.i.12.1.P.Seq


F 


M00022454C:B08 


CH03MAH


495 


11593 


RTA00002906F.p.21.1.P.Seq


F 


M00022094B:G02 


CH03MAH 


496 


3131 


RTA00002908F.m.17.1.P.Seq


F 


M00022494B:D06 


CH03MAH


497 


151263 


RTA00002906F.i.21.1.P.Seq


F 


M00021991D:F09 


CH03MAH 


498 


177542 


RTA00002910F.h.23.1.P.Seq


F 


M00022945B:FU 


CH03MAH


499 


9738 


"RTA00002924F.f.23. 1 P.Seq 


F 


M00039559B:C07 


CH09LNL 


500 


15313 


RTA00002925F.f.05.1.P.Seq


F 


M00039S65A:C09 


CH09LNL 



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SEQ 
ID




SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


501 


19724 




F 


M00022088D:E10 


CH03MAH 


502 


10731 


dt Annnn^9Q^P m 1 1 1 P Sen 


F 


M00003938C:A05 


CH01COH 


503 


10257 


DT Aonnmoni p u no, i P Sen 


F


M00005512B:H01


CH02COH 


504 


186468 


dt a n^nnoo i "ic k iq i p Sen 
R 1 AUUUU— y I Jr.o. l o. i .r .ocq 


F 


M00027746A:D06 


CH04MAL 


505 


14736 


DTAnnnA")Q(1QC r? T7 1 P Sen 


F 


M00022436C:F11 


CH03MAH 


506 


33267 


DT A IVWY1QQQP K 14 1 P Sen 


F 


M00001548B:D06 


CH01COH 


507 


7719 


DT a rfV\n.nOQflQP #» 1 1 1 P Sen 
K 1 AUUUU-yUor.C. I 1 . i .r.ocq 


F 


M00022403C:E12


CH03MAH 


508 


185539 


DT A 1 TP K (Y\ 1 P Sen 


F 


M00027717C:C06 


CH04MAL 


509 


14825 


dt \ nnnnoQO'iP f iq | P Sen 
K I AUUUUiy— *+r.i. i i -r.jcq 


F 


M00039556C:G05


CH09LNL 


510 


3917


DTAnnnn^on^F n h 1 P Sen 
K 1 AUUUU-yuor.p. i j. i .* .-jc\^ 


F 


M00022092D:A11


CH03MAH 


511 


18718 


dt \nnrtfY7<lQ\P h OS I P Sen 


F 


M00004085B:H02 


CH01COH 


512 


186762 


dt a nnnniQ i Qp k 1 1 IP Sen 
K 1 AUUUU-7 1 or.o. 1 1 - 1 -i 


F 


M00032831A:C07 


CH08LNH


513 


2732 


DTAnnnmoTSP i 07 1 P Sen 
K 1 AUUUU-y— jr.l.u / . i.r .ocq 


F 


M00039900B:G04 


CH09LNL 


514 


7684 


R 1 .AUUUU-Vi^r-J- 1 /. l-tr. oeq 


F 


M00039668C:F01


CH09LNL 


515


6852 


dt a /vwyjOIOP r- 00 1 P Sen 


F 


M00039001A:B10 


CH09LNL 


516 


1422 




F 


M00039433C:E03 


CH09LNL 


517 


5560 


RTAQOUUiy 1 /r.i.Ul. l.r.oeq 




M00032734C:C05


CH08LNH 


518 


48734 


RTAUUUUiyUOr.l.^J- i-r .oeq 


F 


M00022487C:C02 


CH03MAH 


519 


10486 


RTAOOUU-ioyyr.g.U / . I .r.oeq 


F 


M00004509B:B10 


CH01COH 


520 


33514 


R I AUUUU-oyur.J.UJ. l -.r.ocq 


F 


M00001626A:D07 


CH01COH 


521 


5821


nT \ A/W\"*Q 1 "7C m D 1 IP ^r»n 

R 1 AUUuU_y I /r.m.u I . i .sr .ocq 


F 


M00032772D:D03 


CH08LNH 


522 


5821 


RTAOOUU^yi /r.l.i4. l.f.oeq 


F


M00032772D:D03 


CH08LNH 


523 


21940 


RTAOUUUj.oyor .a.Uj. l .r .ocq 


F 


M00004134A:A08 


CH01COH 


524 


185724 


RTAOOOU-y 1-Sr.m.Uo. I .rocq 


F 


M00027523A:H05 


CH04MAL


525 


182887 


ot \ nnnnio l AC VOX IP ^en 


F 


M00022992A:H06 


CH03MAH 


526 


21346 


R 1 A OuUU-iyUlr. 1 .r .oeq 


F 


M00005507B:A03 


CH02COH 


527 


5501 


R L AUUUUj-oo /r.n. i..r .ocq 


F 


M00001424B:H06 


CH01COH 


528 


13961 


R 1 AUUUU_oy-rvj. i 4 *- 1 ■r.ocij 


F 


M00003828A:D11


CH01COH 


529 


16784 


dt \ nnnn~WQ HO 1 P Sen 
R 1 AUUUU-ooOr.a.UV- 1 .r .ocq 


F 


MG000133SD:DOl 


CH01COH


530 


17628 


dt a A/vini Q I AP fin 1 P <spn 
R 1 AUUUU-y lOr.l.lU. i.r .ocq 


F 


M00032568B:F08


CH08LNH 


531 


3304 


dt a rtonn*^ "3QQP A fi^ 1 P Sen 


F 


M00004324A:B03 


CH01COH 


532 


14895 


DTAfinfin">Qn 1 P 0 id. 1 P St»n 


F 


M00005504C:F12 


CH02COH 


533 


16036 


dt a nnnn'^ 1 P lr (10 1 P St>n 


F 


M00003763B:B10 


CH01COH 


534 


23877 


dt A/W^n^QO IP If 1 S 1 P ^t»n 


F 


M00003764B:F11 


CH01COH 


535 


186784 


DTAnnnAiQ^fiF i 17 1 P 9en 
K 1 AUUvU_7Jur.i. 1 / . 1 .r.oci| 


F 


M0O056105A:DO6 


CH15CON 


536 


13591 


dt \ nnnn^on 1 P f I S 1 P ^*»n 
R 1 AUUUU-yuir.i- IJ.i-r .ocq 


F 


M00005485C:H04 


CH02COH 


537 


17916 


DT A nnnn" , QOftP n OR I P Sea 


F 


M00022090B:A10 


CH03MAH 


538 


40594 


K I AUUUU— oy / r.l. ID. 1 .r .ocq 


F 


M00004266B:F07 


CH01COH 


539 


9677 


dt \ Annn^oo^P i 0 1 IP *sr»n 


F 


M00039915B:E08


CH09LNL 


540 


77 jo 


dt a nnnn "*S97P »» I P Sen 


F 


M00001393B:C03 


CH01COH 


541 


2474 


RTA00002917F.e.15.1.P.Seq


F 


M00032707D:F08 


CH08LNH 


542 


23810 


RTA00002892F.i.06.1.P.Seq


F 


M00003822D:A02


CH01COH 


543 


24633 


RTA00002907F.i.19.1.P.Seq


F 


M00022208B:D03 


CH03MAH 


544 


720S1 


RTA00002925F.k.03.1.P.Seq


F 


M00039929D:H10


CH09LNL 


545 


5991 


RTA00002916F.i.17.1.P.Seq


F 


M00032597A:H02 


CH08LNH 


546 


14596 


RTA00002911F.n.15.1.P.Seq


F 


M00027131A:B03 


CH04MAL


547 


6923 


RTA00002896F.d.01.1.P.Seq


F 


M00004146B:E08 


CH01COH 


548 


6923 


RTA00002896F.c.24.1.P.Seq


F 


M00004146B:E08


CH01COH 


549 


21S51 


RTA00002S87F.ci.09. 1 .P.Seq 


F 


M00001391D:D03 


CH01COH 


550 


3935 


RTA00002925F.j.08.1.P.Seq


F 


M0005992IC.H11 


CH09LNL 



551


13328 


RTA00002909F.h.08.1.P.Seq


F 


M00022634B:H09 


CH03MAH 


552


2492 


RTA00002887F.e.11.1.P.Seq


F 


M00001393D:E02 


CH01COH


553


11960


RTA00002917F.b.03.1.P.Seq


F 


M00032671B:D08


CH08LNH 


554


I ouvot 


RTA00002912F.f.18.1.P.Seq


F 


M00027319D:F07 


CH04MAL 


555


1 

1 JU't'l 


RTA00002925F.a.09.1.P.Seq


F 


M00039805B:B06 


CH09LNL 


556


S707 

J / V / 


RTA00002909F.k.13.1.P.Seq


F 


M00022672C:H04 


CH03MAH 




95700 


RTA00002911F.p.14.1.P.Seq


F 


M00027182B:G06 


CH04MAL 


558




RTA00002922F.i.23.1.P.Seq


F 


M00039076D:G04 


CH09LNL 


559


JUS f 

OHO I 


RTA00002887F.c.12.1.P.Seq


F 


M00001389D:D06 


CH01COH




1 **S7S 


RTA00002916F.i.12.1.P.Seq


F 


M00032594C:F05 


CH08LNH 


561


40717


RTA00002921F.d.08.1.P.Seq


F 


M00033359C:H05 


CH09LNL 


562


1U / oo 


RTA00002886F.d.24.1.P.Seq


F 


M00001346B:G11 


CH01COH


563


JO/Ol 


RTA00002889F.k.23.1.P.Seq


F 


M00001559A:H09 


CH01COH


564


*t7Qfl 


RTA00002888F.2.08. l.P.Seq 


F 


M00001461D:C10


CH01COH


565


i ni A7 


RTA00002916F.k.22.1.P.Seq


F 


M00032621A:F11


CH08LNH 


566 


13706 


RTA00002905F.e.21.1.P.Seq


F 


M00008019B:A01


CH03MAH 


567 


124172 


RTA00002900F.a.09.1.P.Seq


F 


M00OU4a24A.L» 


CH02COH


568




RTA0OOC910F o P l.P.Seq 


F 


M00022904C:D04 


CH03MAH 




JOJU 


RTAnf)00''9l6F i 09 l.P.Seq 


F 


M00032605B:D09 


CH08LNH 




1 J 1 J4 


RTA00002886F.p.13.1.P.Seq


F 


M00001382D:A07 


CH01COH


571 


25813 


RTA00002910F.i.12.1.P.Seq


F 


M00022952A:B02 


CH03MAH 


572 


17268 


RTA00002886F.d.07.1.P.Seq


F 


M0000 1 j44L>:£Uo 


CH01COH


573


1 jOo-4 


RTA00002915F.j.09.1.P.Seq


F 


M0003l485B:G05 


CH08LNH 


574 


13460 


RTA0000289SF.t. 19. l.P.Seq 


F 


M00004341C:A09 


CH01COH


575 


25115 


RTA00002919F.p.18.1.P.Seq


F 
F 


M0000S016B:E09 


CH08LNH
CH03MAH 


576 

577


19949 


RTA00002905F.e.17.1.P.Seq
RTA00002917F.k.06.1.P.Seq


F 


M00032759A:A03 


CH08LNH 


578 


8243 


RTA00002901F.o.17.1.P.Seq


F 


M00005703B:E03 


CH02COH 


579


i— j / u 


RTA00002900F.k.23.1.P.Seq


F 


M00005359B:BOS 


CH02COH 


<on 
joU 


ZoJJ 1 


RTA00002909F.c.04.1.P.Seq


F 
F 


M00022559D:G10 
M00004054A:D03 


CH03MAH 
CH01COH


581 
582 


15153 
9498 


RTA00002894F.o.21.1.P.Seq
RTA00002894F.e.04.1.P.Seq


F 


n/vnn :oc < r~*.. Rm 
MUUUUjy o j 1-* ■ D U_ 


CH01COH


583 


48140 


RTA00002914F.h.13.1.P.Seq


F 
F 


M0002S21 IA:F10 

M00UU4UO 1 ts . tuJ 


CH08LNH
CH01COH


584 
585 
586 


7626 
22668 
45691 


RTA00002895F.b.04.1.P.Seq
RTA00002896F.p.17.1.P.Seq
RTA00002908F.a.11.1.P.Seq


F 
F 


M00004204C:H08 
M00022305A:B04 


CH01COH
CH03MAH 


587

588 


46969 


RTA00002904F.a.19.1.P.Seq
RTA00002909F. »02. l.P.Seq 


F 
F 


M00007155D:C09
M00022618C:E04 


CH02COH 
CH03MAH 


589 


44030 


RTA00002900F.o.23.1.P.Seq


F 


M00005405C:D01 


CH02COH 




1 X"> vlR 

1 *T — J*tO 


RTA00002905F.h. 10. l.P.Seq 


F 
F 


M00008073A:D01 
M00008059B:F08


CH03MAH 
CH03MAH 


591 
592 


18455 
7501 


RTA00002905F. a. IS. l.P.Seq 
RTA00002S94F.<z.05. 1 .P.Seq 


F 
F 


M0000j99jD:BOj 
M00003959D:A05


run I fflH 

CH01COH


593 
594 
595 
596 


7280
19339
30194 
32650 


RTA00002893F.n.22.1.P.Seq
RTA00002898F.l.12.1.P.Seq
RTA00002922F.k.05.1.P.Seq
RTA00002911F.i.05.1.P.Seq


F 
F 
F 


M00004376D:A12
M00039100A:G04 
M00026994D:D07 


CH01COH
CH09LNL
CH04MAL


597 
598 
599 


10510 
13539 
20149 


RT A00002905 F.d. i 7 . 1 . P.Seq 
RTA00002S98F.t'.03. 1 P.Seq 
RTA0O002917F.O.03.1.P Seq 


F 
F 

F


M00008001B:F05
M00004336A:A01
M00032791D:F01


CH03MAH 
CH01COH
CH08LNH


600 


127 SO 


RTA00002891F.e.06.1.P.Seq


F 


M00001686D:F06


CH01COH



WO 01/02568 
SEQ
ID 


CLUSTER 


SEQ NAME 




CLONE ID


LIBRARY


601 


182479 


RTA00002910F.j.18.1.P.Seq


F 


M000229 / ~C : E05 


CH03MAH


602 


14016 


RTA00002923F.n.17.1.P.Seq


F 


M00039344C:A11


CH09LNL 


603 


76075 


RTA00002890F.h.23. l.P.Seq 


F 


M00001620B:A03


/— t_rn i 


604 


9806 


RTA00002922F.j.05.1.P.Seq


F 


M00039078D:C10


/~* i_rr\nr mt 


605 


9086


RTA00002889F.k.15.1.P.Seq


F 


MOOOO 1 :> :> 3 A : E06 


/™*ijn i ff~\fj 


606 


2619 


RTA00002907F.o.12.1.P.Seq


F 


M00022269C:A04


CH03MAH


607 


17517 


RTA00002907F.h.06. l.P.Seq 


F 


M00022 18r> A:B03 


CH03MAH


608 


5089 


RTA00002915F.e.22.2.P.Seq


F 


MOOOio / / / d:uU4 


CH08LNH


609 


6728 


RTA00002904F.b. 13. l.P.Seq 


F 


M00007 1 / 8 A:C02 




610 


41149


RTA00002899F.2.20. l.P.Seq 


F 


M0000460^tJ:hUi 




611 


35017 


RTA00002892F.f.03.2.P.Seq


F 


MOOOO j8 1 2C : AO J 


run i r*nH 


612 


7008 


RTA00002923F.a.08. l .P.Seq 


F 


MOOO j9 1 62 D : L 04 


CH09LNL


613 


185545 


RTA00002912F.k.16.1.P.Seq


F 


M0002748OC :t09 


CH04MAL


614 


17840 


RTA00002892F.p. 15.2. P.Seq 


F 


MOOOOj8o4B :F07 


nu iLun 


615 


185914 


RTA00002912F.j.24.1.P.Seq


F 


M00027467A:C07


runjvi a I 


616 


6862 


RTA00002903F.b.08. l.P.Seq 


F 


M00006872D:B07




617 


20120 


RTA00002888F.C.24. l.P.Seq 


F 


M00001445B:F06




618


20120 


RTA00002888F.d.01.1.P.Seq


F 


M00001445B:F06




619 


13879 


RTA00002923F.d.02. 1 .P.Seq 


F 


M0003920~A:F07 


CH09LNL


620 


9330 


RTA000029 15F.2. 16. l.P.Seq 


F 


M00028786B:A04 


CH08LNH


621 


21572 


RTA00002921F.h.19.1.P.Seq


F 


M00033441A:B12


r'L/AOF MT 


622 


2943 


RTA00002919F.h.22.1.P.Seq


F 


M00033144A:D02


CH08LNH


623 


32154 


RTA00002905F.b. 16. l.P.Seq 


F 


M00007969D:C01 


CH03MAH


624 


20875 


RTA00002901F.k. 16. l.P.Seq 


F 


M0O00o6O^B:HOj 




625 


186324 


RTA00002912F.d.17.1.P.Seq


F 


M00027274A:A09 


CH04MAL


626 


10768 


RTA00002886F.e.01.1.P.Seq


F 


MOOOO 1 j46B :Cj 1 1 




627 


16711 


RTA00002935F.m. 11.1 .P.Seq 


F 


M00033221C:H11


r~*c_r i 7rflHT V 


628


14688 


RTA00002925F.n. 14. l.P.Seq 


F 


M0004002jo:B 1U 




629 


44419 


RTA00002907F.b. 19. 1 .P.Seq 


F 


M000221 lSA:hOo 


CH03MAH


630 


12614 


RTA00002896F.p.03.1.P.Seq


F 


M0000420 1D:(_01 




631 


21658


RTA00002902F.c.23.1.P.Seq


F 


M0O0O63 /6U:LO- 




632 


10150 


RTA00002901F.i.16.1.P.Seq


F 


M00003D40A:r09 




633 


185909 


RTA00002912F.c.20.1.P.Seq


F 


M00027262A:A07


CH04MAL


634 


14893


RTA00002890F.f.08.1.P.Seq


F 


MOOOO 1 60 /D:HOy 


CH01COH


635 


32125 


RTA00002903F.c.08.1.P.Seq


F 


M000068S4D:AUa 




636 


11909 


RTA00002902F.a.11.1.P.Seq


F 




CH02COH


637 


17237 


RTA00002901F.l.12.1.P.Seq


F 


M00003O lob .i-O/ 


CH02COH


638 


11148


RTA00002900F. j. 1 S. 1 .P.Seq 


F 


M000U5 j4o U : AU-? 




639 


14837


RTA00002925F.n.20. l.P.Seq 


F 


N 1 outwuu _ D A . a U-+ 




640 


4343 


RTA00002897F.l.13.1.P.Seq


F 


M00004_82B:UO/ 




641 


1 0£0£ 

ISoao 




F 


M00004366D:Cll 


CH01COH 


642 


10090 


RTA00002892F.n.10.2.P.Seq


F 


M00003842D:H09


CH01COH 


643 


612 


RTA00002889F.d.13.2.P.Seq


F 


M00001535B:E02


CH01COH 


644 


10752 


RTA00002892F.n.06.2.P.Seq


F 


M00003842D:D11


CH01COH 


645 


167203 


RTA00002914F.c.14.1.P.Seq


F 


M00028070A:H09


CH08LNH


646 


21269 


RTA00002901F.j.15.1.P.Seq


F 


M00005570A:B08


CH02COH 


647 


186250 


RTA00002910F.a.21.1.P.Seq


F 


M00022797D:A06 


CH03MAH 


648


24633


RTA00002907F.i.19.2.P.Seq


F 


M00022208B:D03


CH03MAH 


649 


12295 


RTA00002918F.c.02.1.P.Seq


F 


M00032836B:A07


CH08LNH


650 


7870 


RTA00002905F.b.22. l.P.Seq 


F 


M00007973B:D11


CH03MAH 






PCT/US00/18374 



SEQ
ID


CLUSTER


SEQ NAME


ORIENTATION


CLONE ID


LIBRARY
651 


12225 


RTA00002902F.d.08. l.P.Seq 


F 


M00006585A:D07


CH02COH 


652 


7775 


RTA00002892F.o.12.2.P.Seq


F 


M00003847A:H04


CH01COH 


653 


14901 


RT A00002929F. t\21. l.P.Seq 


F 


M00040349D:D07 


CH14EDT 


654 


6831


RTA00002927F.b.21.1.P.Seq


F 


M00039483A:D10


CH12EDT 


655 


10738 


RTA00002930F.b.08. 1 .P.Seq 


F 


M00042724A:G06 


CH15CON 


656 


17986 


RTA00002932F.a.20. 1 .P.Seq 


F 


M00042972C:F04


CH18CON 


657 


23163 


RTA00002895F.h.03. l.P.Seq 


F 


M00004085A:H01


CH01COH 


658 


4838 


RTA00002923F.i.15.1.P.Seq


F 


M00039284D:H07


CH09LNL 


659 


25386 


RTA00002905F.e.05. l.P.Seq 


F 


M00008007B:E03 



CH03MAH 


660 


13217 


RTA00002887F.n.01.1.P.Seq


F 


M00001422B:D06 


CH01COH 


661 


30656 


RTA00002906F.l.03.1.P.Seq


F 


M00022032A:G05 


CH03MAH


662 


7852 


RTA00002889F.e.14.1.P.Seq


F 


M00001538B:A07 


CH01COH 


663 


13217 


RTA00002887F.m.24.1.P.Seq


F 


M00001422B:D06 


CH01COH 


664 


15152 


RTA00002925F.f.24. l.P.Seq 


F


M00039873B:H04 


CH09LNL 


665 


24143 


RTA00002922F.0. 18. l.P.Seq 


F 


M00039143D:C10 


CH09LNL 


666 


23872 


RTA00002892F.1. 13. l.P.Seq 


F 


M00003823B:A06 


CH01COH 


667 


13940 


RTA00002906F.2.23. l.P.Seq 


F 


M00021967D:H06 


CH03MAH


668 


25759 


RTA00002907F.m. 10. 1 .P.Seq 


F 


M00022249D:C01


CH03MAH 


669 


5761 


RTA00002924F.p.05.1.P.Seq


F 


M00039786D:A10


CH09LNL 


670 


41703 


RTA00002901F.2.23. l.P.Seq 


F 


M00005506D:E11 


CH02COH 


671 


7165 


RT A00002909F. i. 06. l.P.Seq 


F 


M00022648A:D08 


CH03MAH 


672 


41492 


RTA00002889F.m.18.1.P.Seq


F 


M00001565A:H05 


CH01COH


673 


9331 


RTA00002906F. a. 10. l.P.Seq 


F 


M00021953B:E08


CH03MAH


674 


7961 


RTA00002S87F.2.24. 1 .P.Seq 


F 


M00001399B:B01 


CH01COH


675 


15367 


RTA00002893F.n.17.1.P.Seq


F 


M00003958C:H08


CH01COH


676 


185628 


RTA00002912F.f.17.1.P.Seq


F 


M00027319C:C03 


CH04MAL 


677 


7386 


RTA00002891F.l.14.1.P.Seq


F 


M00003768D:D08


CH01COH


678 


67391 


RTA00002893F.p.07.1.P.Seq


F


M00003968C:G03


CH01COH


679 


46380 


RTAO00O29O6F.t. 10. l.P.Seq 


F 


M00021933B:F02


CH03MAH


680 


14265 


RTA00002892F.e.05.2.P.Seq


F


M00003808A:F11


CH01COH


681 


186478 


RTA00002912F.f.07.1.P.Seq


F 


M00027313C:E01


CH04MAL 


682 


8192 


RTA00002916F.m.07.1.P.Seq


F


M00032634B:D09


CH08LNH


683 


13776 


RTA00002925F.l.10.1.P.Seq


F


M00039976C:F11


CH09LNL 


684 


11796 


RTA00002912F.e.02.1.P.Seq


F


M00027291A:G08


CH04MAL


685 


10827 


RTA00002919F.i.10.1.P.Seq


F


M00033147C:B08


CH08LNH


686 


1482 


RTA00002925F.l.12.1.P.Seq


F


M00039977B:D12


CH09LNL


687 


30300 


RTA00002906F. t". 16. 1 .P.Seq 


F


M00021941A:D09


CH03MAH


688 


10454 


RTA00002890F.i.15.1.P.Seq


F


M00001623D:A10


CH01COH


689 


16649 


RTA00002907F.l.01.1.P.Seq


F


M00022229D:E01


CH03MAH


690 


7026 


RTA00002887F.b.10.1.P.Seq


F 


M00001 jfc B:A1 1 


CH01COH


691 


5691 


K I AUUUU_ovor.n. 1 J- 1 .r .oeq 


p 
i 


M00004114C:D11


CH01COH


692 


13797 


RTA00002918F.i.21.1.P.Seq


F


M00032918D:B04


CH08LNH


693 


5187 


RTA00002923F.n.03.1.P.Seq


F


M00039338B:F07


CH09LNL 


694 


186115 


RTA00002912F.i.01.1.P.Seq


F 


M00027376OA02 


CH04MAL


695


4826


RTAO0002917F\2. 24. l.P.Seq 


F 


M00032729A:F10 


CH08LNH


696 


6733 


RTA00002917F.m.11.1.P.Seq


F 


M0003277-iC:C04 


CH08LNH


697 


7604 


RTA00002923F.j.05.1.P.Seq


F 


M00039291D:F02 


CH09LNL 


698 


46459 


RTA00002905F.f.01.1.P.Seq


F


M00008020D:F02


CH03MAH


699 


23385 


RTA00002889F.i.23.1.P.Seq


F


MOO0O155iD:D01 


CH01COH


700 


7516 


RTA00002891F.h.11.1.P.Seq


F


M00003749C:C08


CH01COH






SEQ 
ID 


CLUSTER


SEQ NAME 


ORIENTATION 


CLONE ID


LIBRARY


701 


45048 


K. 1 AUUUU-V'JDr.u.U 4 *. i .r.ocq 


p 


M00021855D:F10 


CH03MAH 


702 


14845 


R 1 AUUUU— vUJrO.Ui. 1 .r.ocq 


p 


M00007103C:C12


CH02COH 


703 


16479 


K I AUUUU-oo /r.i. 1 j. l.r.oeq 




M00001403D:C12


CH01COH 


704 


186729 


K I AUUUU— y l Ir.Q. L7._. r.ocq 


p 


M00026850B:C09


CH04MAL 


705 


33658 


dt \ nnnAiCQAC ■ A*7 1 P Can 
K 1 AUUUU-ooOr .).U / . l .r.oeq 


p 


M00001361B:A12


CH01COH 


706 


186755 


ot \ aaaa">q i ~>C i IQ 1 P C^ri 
R 1 AUUUU_y 1-r.l.lo. i.r .ocq 


p 


M00027400D:H02 


CH04MAL 


707 


4262 


K 1 AUUUU_oy lr .a.U<+. l .r .icq 


p 


M00004208A:D08


CH01COH 


708 


14039 


DT \ AAAA1QG7C L- 0,1 IP ^*»n 
K 1 AUUUU— oV / r.K.ul.l .r.ocq 


p 


M00004276C:A08


CH01COH 


709 


11948


R I AU'JUU^.CSyjr.n.-'*'. i.r. ocq 


p 


M00004118C:D12


CH01COH 


710 


14865 


R 1 AUUUU-yUor.I. l**. l. r.ocq 


p 


M00022481B:A04 


CH03MAH 


711 


10779 


r**r* \ /"vaaata i < c ; ni I O Cjh 
RTAUUUU-y 1 jr.i.U 1 . 1 .r.ocq 


p 


M00031370B:C01 


CH08LNH 


712 


7503 


R i AUUUU— yU_r.K. IO. I.r. ocq 


p 


M00006738A:F12 


CH02COH 


713 


48130 


r» t~ v r\AAA^OAOC A/1 I P 

R 1 AUUUU— 7U-ir.c.u*t. I .r.ocq 


p 


M00006595B:C10 


CH02COH 


714 


7858 


r»T* » AflfimfimC m 1 O 1 D Cart 

RTAUUUU-iyU/r.m. I— - l.r.oeq 


p 


M00022250A:B04 


CH03MAH 


715 


4682 


RTAUUUU^y_4r.n.U.i. l.r.oeq 


p 


M00039710B:E01 


CH09LNL 


716 


20650 


RTAOUUU-iooor .p. 1U. 1 .r.ocq 


p 


M00001503B:H10


CH01COH 


717 


25320 


RTAOUUU^y lUr.e. iy. l.r.oeq 


p 


MOOO' 7 '' 857 B • A09 


CH03MAH 


718 


4924 


RTA000029 jOr .g.u 1 .i.r.beq 


p 

r 




CH15CON 


719 


21170 


m-> » A IT 1 1 1 1 D Can 

RTA0UOU2yUUr.l. I j. l.r.oeq 


p 


M00005365 -VF05 


CH02COH 


720 


9258 


RTA00002S90r.n. 17. l.r.beq 


p 


Mnnoo 1 fi 1 8C • DO 1 


CH01COH 


721 


14039 


r* T" • n AAA*1 O fV"7 C I 1.1 ID Can 

RTA00002S97r .j.24. l.f.seq 


r 


Mnnnn4' , 7fiC AOS 


CH01COH 


722 


3483 


RT A00002S99f- .D.U / . 1 .r.oeq 


p 


Mnf)0fW430 A.' A05 


CH01COH 


723 


3877 


« t* » ^nrtAn emeu m i d 
RTAOOUU-iay /r.tc. lu. l.r.oeq 


p 


M00004278A:G06


CH01COH 


724 


7483 


RTA0000292jr.t. Iy. l.r.oeq 


p 


MOflO 39'>46B • AOS 


CH09LNL 


725 


99750 


OT" \ AAAATOAAC 1 1 1 1 P Carl 

RTAUUUU~yuUr .1. 1 / . I .r.ocq 


p 


M00005366D:F08


CH02COH 


726 


46459 


RTAOUUU^yUDr.e.-:^. i.r.oeq 


p 


M00008020D:F02


CH03MAH 


727 


3591 


K 1 AUUUU— ISoOr.O.u/- 1 .r.ocq 


p 


M00001378C:E10 


CH01COH 


728 


11277


r>*T* \ AAAATGO"2C ft 1 1 IP Q*»n 

R I AUUUU_yz jr. 2. 1 1 . i .r.ocq 


p 


M00039251D:B08


CH09LNL 


729 


10292 


R rAUUUU-oyor.n. 1 1 . 1 .r.ocq 


p 


M00004393C:D06


CH01COH 


730 


23211


OT \ AAAATAOTC L- 1 *i 1 P C^»n 

R 1 AUUUU-y_-r .K. l^. I. r.oeq 


p 


M00039105D:A08


CH09LNL 


731 


185698 


OT* \ rvAAATO 1 1 C A HI ") P 

RTAUUUU_yi I r.a.UJ.-i.r.oeq 


p 


M00026836B:H03 


CH04MAL 


732 


24702 


RTAOUUU-ovar .1. 1 /. i.r. ocq 


p 


M00004360C:D09 


CH01COH 


733 


12595 


R l AUUUU-yu-4-r .c.UD. I .r.ocq 


p 


M00007197B:B05 


CH02COH 


734 


177444 


r»T* \ AAAATO 1 AC /-> AA 1 P 

R L AUUUU_y lUr .O.uo. i .r .ocq 


p 


M00023097D:B08


CH03MAH 


735 


38147 


n t* \ aaaAo Q c act « I 1 ID Cja 
RT AUUUU- ooor .0- 1 1 • l .r.ocq 


p 


M00001379A:F09 


CH01COH 


736 


17909 


DT \ AAAAOOAQC ri AO 1 P ^f»n 

R 1 AUUUUxyUv5r.a.UV. l.r.ocq 


p 


M00022386D:F10 


CH03MAH 


737 


13399 


DT \ AAAA^QAAP n (V7 1 P ^i»n 

K L AUUUU_yuur .n.u / . l .r.ocq 


p 


M00005385A:B12


CH02COH 


738 


17720 


DT \AAAA"*OflSP cr 11 1 P ^r»fl 

K 1 AUUUU— VUjr .i£ i .r.ocq 


F 


M00008065D:A07 


CH03MAH 


739 


45974 


R 1 AUUUU— VUVr .l.uO. 1 .r.ocq 


F 


M00022681D:E10 


CH03MAH 


740 


10779 


DT \ AAAlT^O 1 \C h Oil 1 P ^i»n 

R 1 AUUUU- y 1 3r. n. —-+._. r.ocq 


p 


M00031370B:C01 


CH08LNH 




t CK 


RTA00002914F.a.14.1.P.Seq


F


M00028055B:G07


CH08LNH


742 


1712 


RTA00002915F.j.07.1.P.Seq


F


M00031484A:D03


CH08LNH


743 


185726 


RTA00002912F.a.21.1.P.Seq


F 


M00027215B:B12 


CH04MAL 


744 


150298 


RTA00002907F.d.20.1.P.Seq


F 


M00022140D:A07 


CH03MAH 


745 


358


RTA00002898F.i.02.1.P.Seq


F 


M00004358B:G02 


CH01COH 


746 


42920 


RTA00002900F.l.16.1.P.Seq


F


M00005309B:A11


CH02COH 


747 


25681 


RTA00002894F.k.18.1.P.Seq


F


M00004038A:A04


CH01COH 


748 


18005 


RTA00002903F.c.18.1.P.Seq


F 


M00006890C:F10 


CH02COH 


749 


16143 


RTA00002892F.m.04.2.P.Seq


F


M00003839C:H10


CH01COH 


750 


9306 


RTA00002902F.d.09.1.P.Seq


F


M00006585A:F09


CH02COH 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY


fj 1 




RTA00002901F.i.13.1.P.Seq


F 


M00005535B:B01 


CH02COH 


/ J- 


oy i j 


RTAD00CP901F i 07 1 P Sea 


F 


M00005557D:H10 


CH02COH 


fJJ 


is jo iy 


RTAOOOO^PF a °0 1 P Sea 


F 


M00027215A:F06 


CH04MAL 


TZA 


iUj jy 


RTAnOOO">898F o P 1 P Sea 


F 


M00004406A:G09 


CH01COH 




5 / 4U 


RTAf)fWP9''3F o 11 1 P Sea 


F 


M00039383A:H07


CH09LNL 


/JO 


10U25 / 


RTAf)f)00'>907F IP 1 ? Sea 


F 


M00022237C:E04 


CH03MAH 


/ J / 


OU to 


RTAnnnn^Q lOF c 11 1 PSea 


F 


M00055433D:G03 


CH15CON 


/JO 




PTAOfWPQ/^F b 14 1 P Sea 


F 


M00039377B:E05 


CH12EDT 




OAfiA 

yooo 


RTAnnoo">9j0F f 19 1 P Sea 


F 


M00055794A:E10 


CH15CON 


/ou 


j joy 


RTAOOOCP930F b P 1 P Sea 


F 


M00042732B:H06 


CH15CON 


7 A 1 
/O I 


AQQ 1 


RT AOOOCPS95F i 03.1. P. Sea 


F 


M00004087C:E02


CH01COH 


7A"> 
/02 


1 "5 AAA 


rtaOOOO'WF i 05 1 P Sea 


F 


M00003822C:A09 


CH01COH 


7A7. 


AQOS 


RTAOOOCP930F k ">4 1 P Seq 


F 


M00056458C:E01


CH15CON 


/o4 


I 1 1*^ t 
IUjI 


PTAnnnn^QniF a 15 i p Sea 


F 


M00005504D:F06 


CH02COH 


765 




PTAnnon^SSQF 1 \ 1 P Sea 


F 


M00001512D:F08 


CH01COH 


766 


i5yo 


n-r Annnn^Q^^F m 18 L P Sea 


F 


M00039125D:H12 


CH09LNL 


767 


loo^iy 


ot Annnn^0''4F 0? 1 p Sea 


F 


M0003941 1D:D09 


CH09LNL 


768 


24429 




F 


M00006989B:G05


CH02COH 


769 


3 J /95 


ht Annnn^Qn^F v is 1 PSea 


F 


M00006739B:A04 


CH02COH 


770 


i ,iii;7 
24267 


ox AnnnfP°*°,QF l 17 1 P Sea 


F 


M00001561D:H04 


CH01COH 


771 


lieu: 

125J0 


DTAfinrMT'SO 1 F i l P Sea 


F 


M00003760C:G10 


CH01COH 


••"IT"! 

772 


2262/ 


PTAfttinn*>SS7F k 07 I P Sea 


F 


M00001410A:G10 


CH01COH 


773 


24430 


DTAfinnnioniPh ID 1 PSea 
xx 1 AUvvv- ixr.n.— v. 1 -r .Jc^ 


F 


M00005520B:E01 


CH02COH 


774 


1615 1 


ot Annnn''9Q7F 1 n I P Sea 


F 


M00004284A:F08


CH01COH 


"7*7 C 

775 


6148 


PT Annno^^QOF i !6 1 P Sea 


F 


M00001623D:E12 


CH01COH 


776 


l(JoUo4 


pt Anoni"PQf)°,F 1 19 1 P Sea 


F 


M00022485B:E07


CH03MAH 


■7*7*7 

77 7 


yj / J 


PT Afinnn^SQ'iF n 13 1 PSea 

IX X aUUUU- Q7JI . JJ- l -J 1 • 1 • l • 


F 


M00003970D:H07 


CH01COH 


*7"70 

778 


iy^42 


PTAnnnO'^Qn'/F P0 1 P Sea 


F 


M00006756B:G06 


CH02COH 


77V 


1 AA71 
loo /2 


pt Annnn^^SQF h ^ l 1 P Sea 


F 


MC00015:SC:C03 


CH01COH 


/oU 


Q -n7*5 
oj/J 


RTAnf)00' l S91F 0 07 1 P Sea 


F 


M00003785D:F07


CH01COH 


"TO 1 

78 1 


1 1 A A 

1 j /40 


RTAn00f)" , .S9fiF h 10 1 P Sea 

XX 1 nUwv-07Ul .11. lu. 1 .1 . jv,^ 


F 


M00004163C:A03 


CH01COH 


/o2 


45 UU 


RTAn000' , SS7F b 08 1 P Sea 


F 


M000013S~A:C12 


CH01COH 


TOT 
78J 


IoUUj 


RTAfinno^Q I OF c OS I P Sea 


F 


M00022820A:F07


CH03MAH 


-JO/I 

784 


1 Q7T2 

18 / 2j 


RT A00nn' , 9 1 6F 0 18 1 PSea 


F 


M00032580D:A09


CH08LNH 


785 


/IOTA 

42 /U 


PT AnnnO°0'''?F b 0t 1 P Sea 


F 


M00038616C:C09 


CH09LNL 


7Q A 
/SO 


juuyj 


RTAf)0fHV9O7F i ''O 1 P Sea 


F 


M00022208C:E04


CH03MAH 


/o / 


42y 10 


RTA0000' , 9?4F c 08 1 P Sea 


F 


M0003943?B:D06 


CH09LNL 


"TOO 

788 


1 joj2 


nTAO0nn"'9O''F i 09 1 P Sea 


F 


M00006714C:D06 


CH02COH 




oy / z 


RT AO0OO" , 90' , F i 06 1 P Sea 


F 


M00006712C:H01 


CH02COH 


/yu 


*o iy 


RTAO0OO"'910F i 06 1 P Sea 


F 


M0002294"B:D02 


CH03MAH 






RTA00002928F.t".09. l.P.Seq 


F 


. M0004022-C:F06 


CH13EDT 


792 


9S1S6 


RTA00002909F.m.08.1.P.Seq


F


M00022696B:C11


CH03MAH


793 


3167 


RTA000O:S9SF.°.O9. l.P.Seq 


F 


M0000434-:D:C12 


CH01COH 


794 


3272 


RTA00002897F.a. IS. l.P.Seq 


F 


M00004212D:C03 


CH01COH 


795 


14446 


RTA00002S99F.d.05. l.P.Seq 


F 


M00004462D:D12 


CH01COH 


796 


17865 


RTA00002918F.a.13.1.P.Seq


F


M00032825B:F08


CH08LNH 


797 


5834 


RTA00002898F.h.12.1.P.Seq


F


M00004351A:D08


CH01COH 


798 


14533 


RTA00002896F.k.24.1.P.Seq


F 


M0000417?C:B06 


CH01COH 


799 


15222 


RTA00002900F.j.05.1.P.Seq


F


M00005332A:C06


CH02COH 


800 


22594 


RTA00002898F.h.21.1.P.Seq


F 


M0000435"3:B06 


CH01COH 






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID




LIBRARY 



801 


9204 


RTA00002890F.h.20. l.P.Seq 


F 


LVlUUUO 1 0 1 vL . MUy 


CH01COH


802 


186464 


RTA00002911F.d.09.2.P.Seq


F 


M00U2O3-1- U .l_U.d 


CH04MAL


803 


5441 


RTA00002900F.a.11.1.P.Seq


F 


M000048-4D:H05 


CH02COH


804 


32544 


RTA00002893F.l.21.1.P.Seq


F 


MOOOO j9 j3 B : BU 1 


CH01COH


805 


15351 


RTA00002915F.j.15.1.P.Seq


F 


M000324/ ID: AOs 


CH08LNH


806 


13129 


RTA00002898F.a. 12. 1 .P.Seq 


F 


M000043 10B:E02 




807 


186376 


RTA00002912F.k.21. l.P.Seq 


F 


M00027485C:F07 




808 


17816 


RT A0000290 1 F.o.04. 1 .P.Seq 


F 


MOOOOa 674C : r 04 




809 


8434 


RTA00002923F.1.22. l.P.Seq 


F 


M00039j26C:BO8 




810


T"> t 1 £. 

LI 14o 


RTAfir)00'>9''' , F i 08 l.P Sea 


F 


M00039067B:F07 


CH09LNL 


811 


31912 


RTA00002904F.a. 14. l.P.Seq 


F 


M00007154A:E06 


CH02COH 


812 


148 / 


BTAnnfiO'"?'?5F n 03 1 P Sea 


F 


M00040016C:E07 


CH09LNL 


813 


24777


RTA00002900F.n.02. l.P.Seq 


F 


M00005380B:H10


CH02COH 


814 


144483 


RTA00002902F.d.01.1.P.Seq


F


M00006577A:H10


CH02COH 


815 


0340 


PT AflfinO''935F d 16 1- P Sea 


F 


M00055425C:A04 


CH17COHLV 


816 


5984 


RTA00002935F.p.09. 1 .P.Seq 


F 


M00055420A:E06 


CH17COHLV 


817 


24441 


RTA00002900F.a.22.1.P.Seq


F 


M00004832D:G04 


CH02COH 


818 


20889 


RTA00002935F.h. 09. l.P.Seq 


F 
F 


M00054807D:C11
M00028763A:G11


CH17COHLV
CH08LNH


819 
820 


127721 
20684 


or AniWPQ l SF c IS 1 P Sea 
RTA00002900F.C.03. 1 .P.Seq 


F 
F 


M00004843A:G12 
M00022208C:E04


CH02COH 
CH03MAH 


821 
822 
823 


30095 
6763 
6763 


RTA00002907F.i.20.2.P.Seq 
RTA00002892F.o.01.2.P.Seq 
RTA00002892F.n.24.2.P.Seq 


F 
F 
F 


M00003845D:G03 
M00003845D:G03 
M00022240B:C12 


CH01COH 

CH01COH

CH03MAH 


824 
825 
826
827 


48725 
21260 

/ ' 

3441 


RTA00002907F.l.'>2.2. P.Seq 
RTA00002935F.C.22. 1 .P.Seq 
RTA(Y)00''930F c 9 1. l.P.Seq 
RTA00002935F.i. 13. l.P.Seq 


F 
F 
F 
F 


M00054499A:C0S 
M00055454A:D02 
M00054S90C:D05 
M00042734A:F05 


CH17COHLV 
CH15CON 

CH17COHLV 
CH15CON 


828 
829 
830 
831 


21419 

oUU4 

185870 

i .1 cun 
Z430U 


RT A00002930F.b. 13.1 .P.Seq 
rt AnnofPQ 10F b 08 1 P Sea 
RTA00002912F.C.06. l.P.Seq 
DT A<W)<Y>Q3f)F d 01 1 P Sea 


F 
F 
F 


M00022S053.A10 
M00027247CD02 
M00055466A:F06 


CH03MAH 
CH04MAL 
CH15CON 


832 
833 
834 


5153 
8653 
23799 


RT A00002930F.b. 16. 1 .P.Seq 
RTA00002895F.f. 17. l.P.Seq 
RTA00002924F.1.23. l.P.Seq 


F 
F 
F 
F 


M00042743D:G10 
M000040SOC:C04 
M0003969 oL . B U-? 
M00056215D:F02 


CH15CON 
CH01COH 
nwnof VI 

\w. n\j / Li' l. 

CH15CON 


835 
836 
837 


11012 
46592 
6650 


RT A00002930F. j .09 . 1 . P.Seq 
RTA00002900F.b. 19.1 .P.Seq 
RTA0000290SF.m. 12. l.P.Seq 


F 
F 
F 


M00004S39B:C12 
M00022491D:AlO 
M0000156SC:A03 


CH02COH 
CH03MAH 
CH01COH 


838 
839 


16618 
18274 


RTA00002889F.n. 18. l.P.Seq 
RTA00002889F.2.05. l.P.Seq 


F 
F 


M00001543C:AOS 
M00022442B:G03 


CH01COH 
CH03MAH 


840 
841
842 


20694 
9493 
6132 


RTA0000290SF.h.08. l.P.Seq 
RTA00002909F.m. 11. l.P.Seq 


F 
F 


M00022698C:D10
M00004220D:C11


CH03MAH
CH01COH


843 
844 
845 
846 
847 
848 
849 
850 


186259 
3769 
36584 
38077 
3927 
4275 
12554 
13761 


RTA00002912F.m. 13. l.P.Seq 
RTA000029 1 6F.2.22. 1 .P.Seq 
RTA00002935F.f. 12. l.P.Seq 
RT A00002S90F.e.06. 1 .P.Seq 
RTA00002935F.a.l2. l.P.Seq 
RTA000029 UF.b. 16. 1 .P.Seq_ 
RTA0000292 lF.a.23. 1 .P.Seq 
RTA00002901F.f.22. l.P.Seq 


F 
F 
F 
F 
F 

F

F 
F 


M000275:7B:C05 
M000325S1B:A09 
M000546S3D-.GU 
M00001605B:B05 
M0004251bB:D01 
M0002SOt>3C:H01 
M00033302A:E11
M00005489B:C08


CH04MAL 
CH08LNH

CH17COHLV 
CH01COH 

CH17COHLV
CH08LNH
CH09LNL 
CH02COH 
SEQ
ID 


CLUSTER


SEQ NAME 


ORIENTATION 


CLONE ID 


LIBRARY 


851 


19059 


RTA00002897F.c.22.1.P.Seq


F 


M00004237C:D10 


CH01COH


852 


22944 


RTA00002935F.b. 17. l.P.Seq 


F 


M00043355A:D07


CH17COHLV


853 


2189 


RTA00002925F.j.06.1.P.Seq


F


M00039921A:B10 


CH09LNL 


854 


19153 


RTA00002892F.h.04.2.P.Seq


F 
F 


M00003819B:B01 
M00001606B:A10 


CH01COH
CH01COH


855 
856 
857 
858 
859 
860 
861
862 
863 


1833 

18447 

2461 

15917 

9379 

5511 

10540 

12117 

8777


RTA00002890F.e. 13. l.P.Seq 
RTA00002935F.d.23. l.P.Seq 
RTA00002922F.b.08. l.P.Seq 
RTA00002896F.i.06. l.P.Seq 
RTA00002935F.a. 15. l.P.Seq 
RTA000O2931F.b.06. l.P.Seq 
RTA0O0O289lF.k.l6.1.P.Seq 
RTA00002899F.a. 09. l.P.Seq 
RTA000O29 l9F.a.23. l.P.Seq 


F 
F 
F 
F 
F 
F 
F 
F 
F 


M00054569A:B07 
M00038619B:F09
M00004172C:A08 
M00043299A:B10 
M00042796A:A10 
M00003764B:H11 
M00004419A:G02 
M00033028D:C10 
M00005403C:A01 


CH17COHLV 
CH09LNL 
CH01COH

CH17COHLV
CH16COP
CH01COH
CH01COH
CH08LNH 
CH02COH 


864 
865 
866 
867 
868 
869 
870 
871 
872 
873 
874 
875 
876 
877 
878 
879 


23972 
17005 
1085 
4270 
4609 
6889 
15228 
20971 
5174 
15236 
9223 
24591 
36306 
3309 
186712 
9090 


RTA00002900F.0. 18. l.P.Seq 
RTA00002896F.m. 10.1. P.Seq 
RTA00002924F.1.20. l.P.Seq 
RTA00002922F.a.24. l.P.Seq 
RTA00002935F.e. 15. l.P.Seq 
RT A000029 19F.C.07. l.P.Seq 
RTA000029 19F.e.06. l.P.Seq 
RTA00002904F.a.22. l.P.Seq 
RTA00002935F.a.23. l.P.Seq 
RT A000029 2 8 F. e . 1 6. 1 .P.Seq 
RTA00002S96F.b. 15. 1 -P.Seq 
RTA00002923F.2.10.1.P.Seq 
RTA00002S88F.1.1 1. l.P.Seq 
RTA00002893F.j.21. l.P.Seq 
RTA000029 1 IF.c. 1 1 .2.P.Seq 
RTA00002S91F.S.23. l.P.Seq 


F 
F 
F 
F 
F 
F 
F 
F 
F 
F 
F 
F 
F 
F 
F 
F 


M00004187B:C02 
M00039694C:H01 
M000386I6C:C09 
M00054599D:B03 
M00033037B:F04 
M00033055D:D02 
M00007153D:D03 
M00043313D:E09 
M00O4019SA:F12 
M00004141 A:D01 
M00O39251CH12 
M000014S5C:F06 
M00003916A.E04 
M00026S09A:H08 
M00003746C:Ell 
M00001467OD04 


CH01COH
CH09LNL 
CH09LNL 

CH17COHLV
CH08LNH
CH08LNH
CH02COH

CH17COHLV
CH13EDT
CH01COH
CH09LNL
CH01COH
CH01COH
CH04MAL
CH01COH
CH01COH


880 
881 
882 
883 
884 
885 
886 
887 
888 
889 
890 


11510 
9784 
25618 
12493 
24361 
12449 
17894 
13204 
32119 
5909 
24453 


RTA00O02888F.i.O7. l.P.Seq 
RTA000028s9r.|.13. l.r.oeq 
RTA00002930F.a. 1 1. l.P.Seq 
RTA00002928F.i. 11. l.P.Seq 
RTA00002933F.b.08. 1 .P.Seq 
RT A00002930F.d.02. 1 .P.Seq 
RTA00002929F.a.04. 1 .P.Seq 
RTA00002930F.f.09.l. P.Seq 
RTA00002930F.C. 19. 1 .P.Seq 
RTA00002935F.I.23. l.P.Seq 
RTA00002927F.d. 15. l.P.Seq 


F 
r 

F 

f F 
F 
F 
F 
F 
F 
F 
F 


M00001554C:G10 
M00042554A:D01 
M000402S9D:C06 
M00043134A:F05 
M00055468A:A08 
M00039747B:B06 
M00055745B:A08 
M00055448B:E05
M00054931D:E10
M00039526A:A08


CH01COH
CH15CON
CH13EDT
CH19COP
CH15CON
CH14EDT
CH15CON
CH15CON
CH17COHLV
CH12EDT


891 
892 
893 
894 
895 
896 
897 
898 
899 
900 


46982 
43888 
24580 
186495 
12420 
3833 
10438 
12367 
5012 
6458 


dt AfinnrPQ^SF k *> n 1 P Sea 
RTA00002932F.b.23. 1 .P.Seq 
RT A00002930F.C.24. 1 .P.Seq 
RT A00002927F.a.2 1 . 1 .P.Seq 
RTA00002932F.b.2 1. 1 .P.Seq 
RTA000029 16F.e. 14. 1. P.Seq 
RT A00002930F. j . 1 3 . 1 . P.Seq 
RTA00002922F.n. 10. l.P.Seq 
RT A00002930F.k. 16. 1 .P.Seq 
RT A00002929F.C.2 1 . 1 .P.Seq 


F 
F 
F 
F 
F 
F 
F 
F 
F 
F 


M00055093B:A03 
M000430~0A:C03 
M00055466A:F06
M000393o4D:E05 
M000430o3C:H05 
M00032562C:F01
M00056230D:E07
M00039133B:D06
M0005o342A:C03 
M00040291A:G10


CH17COHLV
CH18CON
CH15CON
CH12EDT
CH18CON
CH08LNH
CH15CON
CH09LNL
CH15CON
CH14EDT






SEQ 
ID 


CLUSTER 


SEQ NAME 


ORIENTATION


CLONE ID 


LIBRARY 


951 


11994 


RTA00002890F.n.14.1.P.Seq


F


M00001617C:F10


CH01COH 


952 


186664 


RTA00002932F.a.05.1.P.Seq


F 


M00042585D:D03 


CH18CON


953 


162235 


RT A0000-9U7 r . | .Uo. _ . r .ieq 


p 

r 


M000' r '''l' , D-G02 


CH03MAH 


954 


2127 


RTA000029 12F.0. 14. 1 .P.Seq 


F 


M00027605B:D09 


CH04MAL 


955 


41014 


RTA00002901F.n.04.1. P.Seq 


F 


M00005641B:E09 


CH02COH 


956 


17636 


RTA00002933F.C. 19. 1 .P.Seq 


F 


M00043222C:B06 


CH19COP 


957 


2328 


RTA00002935F.e.05. 1 .P.Seq 


F 


M00054579A:C02 


CH17COHLV 


958 


15414 


RTA00002935F.D. 13.1. P.Seq 


F 


M00055423C:H10 


CH17COHLV 


959 


11948 


RTA00002895F.o.01.1.P.Seq


F 

p 


M00004118C:D12 
M00007082D:E05


CH01COH
CH02COH 


960 
961 


24759 
15152 


RTA00002903F.n. 05. l.P.Seq 
RT A000029'>5F.z.0 1 . 1 .P.Seq 


r 

F 

F 


M00039873B:H04 
M00038616D:B07 


CH09LNL 
CH09LNL 


962 
963 
964 


14917 
12941 
29676 


RTA00002922F.b.02. 1 .P.Seq 
RTA00002889F.C. 15.1. P.Seq 
RTA0000293 lF.b.03. 1 .P.Seq 


F 
F 


M00001532A:G08 
M00042788A:F04


CH01COH
CH16COP 


965 
966 


17789 
45097 


RT A00002 89 1 F. a.2 1 . 1-P Seq 
RTA00002928F.g.06. l.P.Seq 


p 

r 
F 


M00001671A:H10
M00040247D:D02


CH01COH
CH13EDT 


967 
968 


18407 
22309 


RTA00002909F.b. 1 1.1. P.Seq 
RTA00002900F.n. 19. l.P.Seq 


F 
F 
F 


M00022546B:E05 
M00005392A:G06
M00022224A:G07


CH03MAH 
CH02COH 
CH03MAH 


969 
970 
971 
972 
973 


109382 
92273 
8403 
7763 
13470 


RTA00002907F.k. 13. l.P.Seq 
RT A00002909F.j. 17. 1 .P.Seq 
RTA00002915F.j.22. l.P.Seq 
RTA0000292SF.h. 10. l.P.Seq 
RTA00002930F.k.09. l.P.Seq 


F 
F 

F 

tr 
r 

F 


M00022662D:H03 
M00032474A:G03 
M00040267D:A12 
MnflO56j04 AH05 
M00033556D:C10 


CH03MAH 
CH08LNH 
CH13EDT 
CH15CON 
CH09LNL 


974 
975 
976 
977 
978 
979 
980 
981 


1484 
10345 
17242 
171180 
16790 
139516 
4825 
8830 


RT A0000292 IF.k. 10. 1 .P.Sea 
RTA00002S92F.o.l9.2.P.Seq 
RTA0000293 lF.a.05. 1 .P.Seq 
RTA00002909F.f.24. l.P.Seq 
RTA000029 14F.C.03. 1 .P.Seq 
RTAG0002903F.1.02. l.P.Seq 
RTA00002930F.b.l5. l.P.Seq 
RT A00002930F.a.05. 1 .P.Seq 


F 
F 
F 
F 
F 
F 
F 
F 


M00003848C:G09 
M00042433A:E11 
M00022618B:D09 
M00028067A:C11 
M00007032C:A12 
M00042742B:E04 
M00042528C:H01 
M00055391B:C07 


CH01COH
CH16COP 
CH03MAH 
CH08LNH 
CH02COH 
CH15CON 
CH15CON 
CH17COHLV 


982 
983 
984 
985 
986 
987 
988 
989 
990 
991 
992 


12398 
17867 
15796 
185669 
13638 
8280 
12632 
7620 
23922 
43864 
34478 


RT A00002935F.0. 19. 1 .P.Seq 
RTA00002900F.C. U. 1 .P.Seq 
RTA000029.oF.b. 12. l.P.Seq 
RT A00002935F. f. 1 3. 1 .P.Seq 
RTA00002935F.i.20. l.P.Seq 
RT AO0002930F.e. 1 2. 1 .P.Seq 
RTA00002931F.C.03: l.P.Seq 
RTA00002935F.m. IS. 1 .P.Seq 
RTA00002935F.m.20. l.P.Seq 
RTA0000293 lF.b.05. 1 .P.Seq 
RTA00002929F.2. 13. l.P.Seq 


F 
p 

r 
F 
F 
F 
F 
F 
F 
F 
r 
F 


M00004850A:B02
M000433~9CF1 1 
M000546S6A:A09 
M00055002B:EOS 
M00055653C.B07 
M00042S60B:C07 
M0005524OA:A08 
M00055244B-.F07 
M00042794A:F01 
MOOf)40367 A COS 
M00043221D:C12 


CH02COH 
CH17COHLV 
CH17COHLV 
CH17COHLV 
CH15CON 
CH16COP 
CH17COHLV 
CH17COHLV 
CH16COP 
CH14EDT 
CH19COP 


993 
994 
995 
996 
997 
998 
999 
1000


6861 

13971 

13971 

13244 

7455 

18915 

4023 

10785 


RTA00002933F.c.17.1.P.Seq
RTA00002933F.b.01.1.P.Seq
RTA00002933F.a.24.1.P.Seq
RTA00002927F.e.08.1.P.Seq
RTA00002935F.d.11.1.P.Seq
RTA00002929F.b.21.1.P.Seq
RTA00002935F.h.03.1.P.Seq
RTA00002933F.a.11.1.P.Seq


F 
F 
F 
F 
F 
F 
F 


M00043099A:H04
M00043099A:H04
M00039537A:FOS 
"~ M000545:SB:E05 
M00040201A:H01 
M000547S6C:D03 
M00043074C:D07 


CH19COP 
CH19COP 
CH12EDT 

CH17COHLV 
CH14EDT 

CH17COHLV 
CH19COP






SEQ 
ID 



CLUSTER 



SEQ NAME 



ORIENTATION



CLONE ID 



LIBRARY 



1055 
1056 



21401 



RTA00002930F.e.20.2.P.Seq 




RTA00002935F.p.01.1.P.Seq



M00055676A:G02 



M00055402A:H01 



CH15CON 



13060 



RTA00002934F.a.19.1.P.Seq



M00043529A:B03 



23439 
20547 



RTA00002935F.o.24.1.P.Seq



F

F



M00055402A:H01 



CH17COHLV 



1057
1058 



RTA00002931F.b.12.1.P.Seq
RTA00002908F.a.17.1.P.Seq



RTA00002924F.l.09.1.P.Seq



RTA00002935F.d.20.1.P.Seq



RTA00002935F.r. 14. 1. P.Seg 



RTA00002922F.n.12.1.P.Seq



RTA00002911F.f.11.1.P.Seq
RTA00002886F.l.16.1.P.Seq



RTA00002919F.i.14.1.P.Seq



RTA00002908F.h.11.1.P.Seq



F

F



F
F



RTAOO002901F.Q. 19. l.P.Seg 



RTA00002932F.b.12.1.P.Seq



RTA00002898F.i.11.1.P.Seq
RTA00002915F.m.02.2.P.Seq



RTA00002909F.i.23.1.P.Seq



RTA00002907F.m.01.1.P.Seq



RTA00002909F.l.16.1.P.Seq



RTA00002909F.l.13.1.P.Seq



RTA00002897F.l.09.1.P.Seq



RTA00002918F.c.01.1.P.Seq



RTA00002902F.h.08.1.P.Seq



RTA00002902F.g.06.1.P.Seq



RTA00002916F.f.05.1.P.Seq



RTA00002902F.a.05.1.P.Seq



RTA00002907F.i.09.2.P.Seq



RTA00002916F.b.19.1.P.Seq



RTA00002911F.e.24.1.P.Seq



RTA00002907F.Q. 19. l.P.Seg 



RTA00002887F.a.09.1.P.Seq



RTA00002914F.h.23.1.P.Seq



RTA00002917F.g.15.1.P.Seq



RTA00002923F.o.07.1.P.Seq



RTA00002928F.d.02.1.P.Seq



_F_ 
_F_ 
F 



F 

F 



RTA00002392F.t. 10.2. P.Seg 



RTA00002919F.t.l4. l.P.Seg 



RTA0000288SF.a.04. l.P.Seg 



RTA00002S97F.d.03. l.P.Seg 



RTA00002935F.k.l 1.1. P.Seg 



RTA0000291 lF.f.0 1.1. P.Seg 



RTA0000292 lF.a.24.1.P.Seg 



RTA0000290SF.k.23. l.P.Seg 



F 
F 



M000433?6D:B03 



CH17C0HLV 1 



M00043402B:G07 



M00042 350A:A05 



M00055232A:E03 



M00054693A:E11 



MO0O5632OB:A03 



M00Q42352D:B03 
M00001553C:GL1 
M00Q42449B:F05 



M00006994C:F06 



M00022363OD05 



M00Q396S6C:C0t_ 



M0005454SC:H06 



M00054686A:F10 



M00039134D:F08 



M000269L4C:H09 
M00001369A:G06 



M00022446C:H06 



M00005703D:G10 



M00043016B:F09 



M00004359A:E01 
M00032494C:H08 



M00022656D:D07 



M00022240D:B11 



M00022690A:A07 



M000226S4A:E06 



M000042SLVC04 



M00032S:-5D:G04 



M000066~SC:C02 



M00006646A:A07 



CH17C0HLV 1 



CH16C0P 



CH17C0HLV I 



CH15C0N 



C H17C0HLV | 

CH01COH 
CH17C0HLV | 



CH02COH 



CH03MAH 
CH09LNL 



CH17CQHLV ] 



CHL7COHLV I 



CH09LNL 



CH04MAL 
CHOICOH 
CH08LNH 



CH03MAH 



CH02CQH 



CHI SCON. 
CHOICOH 
CH08LNH 
CH03MAH 



CH03MAH 



CH03MAH 
CH03MAH 



CHOICOH 
CH08LNH 



CH02COH 



CH02COH 



M0003256"B:G05 



M0OOO5~63D:AOl 



M00022200B:B05 



M00032541OG03 



M00026900A:H07 
M000222T3A:E03 



M0O0O13S5A:E07 



M0002S212D:C05 
M000327:TA:E04_ 



M000393S1C-.C07 



M00040169A:G06 



M00003S UA:G05 
MOOQ330T:a:A09 



M00001433B:E02_ 



M000042Z5P:E03 



M00055055C:F01 



M00026900A:H07 
M000334--4D:F05 



M0Q0224~-1B:C0S 



CH08LNH 



CH02COH 



CH03MAH 



CH08LNH_ 
CH04MAL 



CH03MAH 



CHOICOH 



CHOSLNH 



CH08LNH 



CH09LNL 
CH13EDT 



CHOSLNH, 
CHOICOH 



CHOICOH 



CH17C0HLV 
CH04MAL 



CH09LNL 



CH03MAH 






SEQ ID  CLUSTER      SEQ NAME                     ORIENTATION  CLONE ID          LIBRARY
1351    47793        RTA00002930F.a.03.2.P.Seq    F            M00055809A:B09    CH15CON
1352    7695         RTA00002892F.n.18.2.P.Seq    F            M00003845A:C07    CH01COH
1353    16997        RTA00002922F.k.15.1.P.Seq    F            M0003910_A:E12    CH09LNL
1354    25441        RTA00002906F.i.08.1.P.Seq    F            M00021981A:C02    CH03MAH
1355    4303         RTA00002897F.o.20.1.P.Seq    F            M00004295D:C07    CH01COH
1356    5741         RTA00002887F.c.19.1.P.Seq    F            M00001390D:E02    CH01COH
1357    17264        RTA00002900F.a.18.1.P.Seq    F            M00004831C:G11    CH02COH
1358    11766        RTA00002925F.f.20.1.P.Seq    F            M000398"1C:G05    CH09LNL
1359    13618        RTA00002893F.o.15.1.P.Seq    F            M00003963D:F01    CH01COH
1360    13903        RTA00002923F.c.18.1.P.Seq    F            M00039204A:E09    CH09LNL
1361    10673        RTA00002927F.h.23.1.P.Seq    F            M00039646A:E06    CH12EDT
1362    17412        RTA00002932F.b.11.1.P.Seq    F            M00043015D:D05    CH18CON
1363    2218         RTA00002919F.a.20.1.P.Seq    F            M00033028C:A02    CH08LNH
1364    5858         RTA00002923F.i.01.1.P.Seq    F            M00039275B:E02    CH09LNL
1365    2510         RTA00002898F.b.14.1.P.Seq    F            M00004316A:B03    CH01COH
1366    8050         RTA00002900F.n.04.1.P.Seq    F            M00005383A:C11    CH02COH
1367    186538       RTA00002929F.e.18.1.P.Seq    F            M00040329A:H05    CH14EDT
1368    25427        RTA00002935F.n.20.1.P.Seq    F            M0005533"B:C04    CH17COHLV
1369    24098        RTA00002901F.a.10.1.P.Seq    F            M00005422D:H02    CH02COH
1370    123823       RTA00002905F.h.08.1.P.Seq    F            M00008071D:H03    CH03MAH
1371    3644         RTA00002901F.c.03.1.P.Seq    F            M000054-lfD:D04   CH02COH
1372    27783        RTA00002917F.a.17.1.P.Seq    F            M00032666A:C02    CH08LNH
1373    1682         RTA00002910F.b.03.1.P.Seq    F            M0002280iD:D09    CH03MAH
1374    3200         RTA00002887F.e.07.1.P.Seq    F            M0000139?C:F04    CH01COH
1375    8442         RTA00002917F.h.23.1.P.Seq    F            M0003273-B:E12    CH08LNH
1376    15353        RTA00002910F.e.11.1.P.Seq    F            M00022854C:G07    CH03MAH
1377    6314         RTA00002922F.b.06.1.P.Seq    F            M00038618D:D08    CH09LNL
1378    93549        RTA00002909F.j.14.1.P.Seq    F            M00022662C:H04    CH03MAH
1379    15496        RTA00002906F.p.03.1.P.Seq    F            M00022088B:H02    CH03MAH
1380    16572        RTA00002886F.k.03.1.P.Seq    F            M0000136-A:C09    CH01COH
1381    74821        RTA00002890F.p.21.1.P.Seq    F            M0000166?A:A12    CH01COH
1382    11315        RTA00002889F.d.12.1.P.Seq    F            M00001535B:B10    CH01COH
1383    10859        RTA00002894F.c.18.1.P.Seq    F            M00003980D:C06    CH01COH
1384    15391        RTA00002914F.f.04.1.P.Seq    F            M0002819?B:E07    CH08LNH
1385    23172        RTA00002896F.b.18.1.P.Seq    F            M0000414!B:F08    CH01COH
1386    22510        RTA00002886F.l.05.1.P.Seq    F            M00001368A:C02    CH01COH
1387    17156        RTA00002934F.a.08.1.P.Seq    F            M00043455B:C08    CH20COHLV
1388    4593         RTA00002896F.o.18.1.P.Seq    F            M0000420CC:A04    CH01COH
1389    2178         RTA00002901F.m.08.1.P.Seq    F            M0000562cD:G11    CH02COH
1390    1015         RTA00002933F.c.11.1.P.Seq    F            M00043213A:D05    CH19COP
1391    26792        RTA00002907F.a.18.1.P.Seq    F            M0002210:-C:D05   CH03MAH
1392    27830        RTA00002921F.c.07.1.P.Seq    F            M0003334-:A:B06   CH09LNL
1393    14648        RTA00002898F.j.11.1.P.Seq    F            M0000436fC:G11    CH01COH
1394    12585        RTA00002897F.i.20.1.P.Seq    F            M0000426^A:F11    CH01COH
1395    15825        RTA00002916F.d.12.1.P.Seq    F            M0003255.-A:A07   CH08LNH
1396    7043         RTA00002900F.h.07.1.P.Seq    F            M0000501-B:F02    CH02COH
1397    29354        RTA00002905F.c.13.1.P.Seq    F            M00007981C:F07    CH03MAH
1398    29703        RTA00002907F.d.24.1.P.Seq    F            M0002214-C:E12    CH03MAH
1399    6811         RTA00002913F.b.07.1.P.Seq    F            M0002772O:D04     CH04MAL
1400    12657        RTA00002906F.b.20.1.P.Seq    F            M0002186cC:H08    CH03MAH




SEQ ID  CLUSTER      SEQ NAME                     ORIENTATION  CLONE ID          LIBRARY
1401    2033         RTA00002922F.e.08.2.P.Seq    F            [illegible]       [illegible]
1402    24229        RTA00002920F.b.04.1.P.Seq    F            M00033329C:C02    CH08LNH
1403    20664        RTA00002886F.a.07.1.P.Seq    F            M00001338C:F05    CH01COH
1404    3656         RTA00002902F.f.20.1.P.Seq    F            [illegible]       [illegible]
1405    10998        RTA00002931F.c.07.1.P.Seq    F            M00042878D:G06    CH16COP
1406    1150         RTA00002922F.i.14.1.P.Seq    F            M00039081B:G07    CH09LNL
1407    [illegible]  RTA00002900F.h.06.1.P.Seq    F            M00005013D:H05    CH02COH
1408    34505        RTA00002901F.a.16.1.P.Seq    F            M00005423C:A10    CH02COH
1409    8175         RTA00002924F.f.01.1.P.Seq    F            M00039472B:E05    CH09LNL
1410    8175         RTA00002924F.e.24.1.P.Seq    F            M00039472B:E05    CH09LNL
1411    19375        RTA00002903F.n.02.1.P.Seq    F            [illegible]       [illegible]
1412    10866        RTA00002929F.c.15.1.P.Seq    F            M00040219B:B07    CH14EDT
1413    24166        RTA00002891F.k.07.1.P.Seq    F            M00003763A:B02    CH01COH
1414    15333        RTA00002888F.c.12.1.P.Seq    F            M00001442C:G12    CH01COH
1415    44436        RTA00002907F.b.17.1.P.Seq    F            M00022117C:A02    CH03MAH
1416    9247         RTA00002930F.a.16.1.P.Seq    F            M00042560C:G06    CH15CON
1417    12317        RTA00002908F.2.13.1.P.Seq    F            M00022430C:C06    CH03MAH
1418    11968        RTA00002890F.i.24.1.P.Seq    F            M00001625D:B04    CH01COH
1419    14181        RTA00002908F.n.09.2.P.Seq    F            M00022499D:D08    CH03MAH
1420    15359        RTA00002909F.l.02.1.P.Seq    F            M00022677C:C01    CH03MAH
1421    46675        RTA00002916F.h.03.1.P.Seq    F            M000325S-iA:D06   CH08LNH
1422    24898        RTA00002903F.k.17.1.P.Seq    F            M00007019B:E01    CH02COH
1423    156424       RTA00002905F.m.22.1.P.Seq    F            M00021653A:B02    CH03MAH
1424    11996        RTA00002901F.b.24.1.P.Seq    F            M00005445A:E07    CH02COH
1425    11996        RTA00002901F.c.01.1.P.Seq    F            M00005445A:E07    CH02COH
1426    4784         RTA00002894F.e.20.1.P.Seq    F            M00003988D:B01    CH01COH
1427    9120         RTA00002914F.h.10.1.P.Seq    F            M000J8210B:H11    [illegible]
1428    11295        RTA00002890F.j.15.1.P.Seq    F            M00001632C:A10    CH01COH
1429    3991         RTA00002896F.h.05.1.P.Seq    F            M00004162D:F02    CH01COH
1430    20358        RTA00002908F.b.06.1.P.Seq    F            M0002236~D:G11    CH03MAH
1431    12823        RTA00002921F.h.01.1.P.Seq    F            M00033434D:F05    CH09LNL
1432    147419       RTA00002906F.S.05.1.P.Seq    F            M00021952B:G06    CH03MAH
1433    12174        RTA00002919F.f.13.1.P.Seq    F            M00033071D:E08    CH08LNH
1434    35608        RTA00002897F.o.24.1.P.Seq    F            M00004296B:D03    CH01COH
1435    2325         RTA00002894F.2.07.1.P.Seq    F            M00003994.-VB10   CH01COH
1436    166261       RTA00002908F.l.05.1.P.Seq    F            M00022475D:C07    CH03MAH
1437    5713         RTA00002920F.a.09.1.P.Seq    F            M00033324B:F04    CH08LNH
1438    3624         RTA00002910F.2.06.1.P.Seq    F            M00022901A:C05    CH03MAH
1439    10305        RTA00002909F.a.07.1.P.Seq    F            M00022530B:C04    CH03MAH
1440    7768         RTA00002910F.k.22.1.P.Seq    F            M00022992B:G12    CH03MAH
1441    9847         RTA00002908F.p.07.1.P.Seq    F            M00022516B:C05    CH03MAH
1442    [illegible]  RTA00002887F.o.06.1.P.Seq    F            M00001426C:F06    CH01COH
1443    24376        RTA00002900F.b.07.1.P.Seq    F            M0000483cB:C02    CH02COH
1444    8743         RTA00002907F.n.19.1.P.Seq    F            M00022262A:F06    CH03MAH
1445    22251        RTA00002926F.c.10.2.P.Seq    F            M00040079B:F06    CH09LNL
1446    12337        RTA00002928F.d.07.1.P.Seq    F            M00040173D:A04    CH13EDT
1447    13623        RTA00002911F.d.08.2.P.Seq    F            M00026842B:A01    CH04MAL
1448    5521         RTA00002887F.j.06.1.P.Seq    F            M00001406B:H09    CH01COH
1449    2193         RTA00002933F.a.13.1.P.Seq    F            M0004307"B:F11    CH19COP
1450    773          RTA00002889F.i.02.1.P.Seq    F            M0000155!D:H09    CH01COH






SEQ ID  CLUSTER      SEQ NAME                     ORIENTATION  CLONE ID          LIBRARY
1451    142367       RTA00002927F.h.11.1.P.Seq    F            M00039630D:B07    CH12EDT
1452    19284        RTA00002889F.e.10.1.P.Seq    F            M00001537B:H10    CH01COH
1453    24011        RTA00002924F.c.17.1.P.Seq    F            M00039440C:G06    CH09LNL
1454    5930         RTA00002911F.f.08.1.P.Seq    F            M00026910B:G06    CH04MAL
1455    21581        RTA00002902F.c.05.1.P.Seq    F            M00005822C:A04    CH02COH
1456    3662         RTA00002925F.c.07.1.P.Seq    F            M00039826D:E04    CH09LNL
1457    4873         RTA00002930F.b.05.1.P.Seq    F            M00042719A:G08    CH15CON
1458    11214        RTA00002896F.h.01.1.P.Seq    F            M00004161A:E08    CH01COH
1459    22888        RTA00002892F.l.09.1.P.Seq    F            M0000383~C:D10    CH01COH
1460    15490        RTA00002925F.k.08.1.P.Seq    F            M00039932B:A07    CH09LNL
1461    112819       RTA00002905F.o.13.1.P.Seq    F            M00021676C:G03    CH03MAH
1462    19688        RTA00002896F.l.02.1.P.Seq    F            M00004179D:A12    CH01COH
1463    15132        RTA00002922F.n.20.1.P.Seq    F            M00039138B:G05    CH09LNL
1464    25022        RTA00002914F.l.21.1.P.Seq    F            M00028219B:H05    CH08LNH
1465    16303        RTA00002888F.b.12.1.P.Seq    F            M00001438A:E01    CH01COH
1466    16828        RTA00002897F.b.04.1.P.Seq    F            M00004214A:E05    CH01COH
1467    14295        RTA00002921F.a.18.1.P.Seq    F            M00033296C:C11    CH09LNL
1468    1979         RTA00002930F.f.06.1.P.Seq    F            M00055725D:D09    CH15CON
1469    36248        RTA00002888F.2.05.1.P.Seq    F            M00001460C:E10    CH01COH
1470    5676         RTA00002926F.b.22.2.P.Seq    F            M00040075B:A05    CH09LNL
1471    1239         RTA00002887F.o.21.1.P.Seq    F            M00001428B:C10    CH01COH
1472    7937         RTA00002917F.2.22.1.P.Seq    F            M00032728D:F01    CH08LNH
1473    4483         RTA00002911F.d.[illegible].P.Seq  F       M00026856B:G03    CH04MAL
1474    7796         RTA00002925F.c.05.1.P.Seq    F            M00039826B:F09    CH09LNL
1475    17330        RTA00002915F.a.03.1.P.Seq    F            M00028616C:D09    CH08LNH
1476    25620        RTA00002902F.f.09.1.P.Seq    F            M00006631C:A04    CH02COH
1477    20601        RTA00002923F.l.20.1.P.Seq    F            M00039326A:G07    CH09LNL
1478    6205         RTA00002923F.2.21.1.P.Seq    F            M00039258C:C01    CH09LNL
1479    726          RTA00002913F.b.16.1.P.Seq    F            M00027734D:C03    CH04MAL
1480    104999       RTA00002908F.2.17.1.P.Seq    F            M00022435B:G12    CH03MAH
1481    30321        RTA00002919F.o.17.1.P.Seq    F            M00033264B:E06    CH08LNH
1482    5878         RTA00002913F.a.16.1.P.Seq    F            M00027688C:C01    CH04MAL
1483    5944         RTA00002905F.m.07.1.P.Seq    F            M00021649B:A02    CH03MAH
1484    5796         RTA00002908F.i.21.1.P.Seq    F            M0002245~A:G05    CH03MAH
1485    3804         RTA00002935F.m.24.1.P.Seq    F            M00055254A:H03    CH17COHLV
1486    2728         RTA00002918F.a.22.1.P.Seq    F            M00032828A:A06    CH08LNH
1487    3804         RTA00002935F.n.01.1.P.Seq    F            M00055254A:H03    CH17COHLV
1488    3932         RTA00002915F.o.19.2.P.Seq    F            M0003251~C:E10    CH08LNH
1489    16691        RTA00002891F.o.03.1.P.Seq    F            M0000378CA:G01    CH01COH
1490    15430        RTA00002900F.2.10.1.P.Seq    F            M00005003D:C02    CH02COH
1491    5637         RTA00002925F.b.18.1.P.Seq    F            M00039820B:F06    CH09LNL
1492    16633        [illegible]                  F            M00004246B:H07    CH01COH
1493    21826        RTA00002898F.2.06.1.P.Seq    F            M00004344A:G11    CH01COH
1494    22193        RTA00002919F.l.09.1.P.Seq    F            M00033UcD:A03     CH08LNH
1495    10720        RTA00002898F.c.14.1.P.Seq    F            M000043:CC:E07    CH01COH
1496    22491        RTA00002925F.m.06.1.P.Seq    F            M00040003A:G10    CH09LNL
1497    10423        RTA00002915F.n.13.2.P.Seq    F            M0003250"D:G08    CH08LNH
1498    4953         RTA00002916F.h.11.1.P.Seq    F            M0003258cC:B04    CH08LNH
1499    185567       RTA00002911F.p.08.1.P.Seq    F            M00027178B:A11    CH04MAL
1500    25605        RTA00002924F.m.22.1.P.Seq    F            M0003971CB:A01    CH09LNL






SEQ ID  CLUSTER      SEQ NAME                     ORIENTATION  CLONE ID          LIBRARY
1501    29446        RTA00002906F.m.24.1.P.Seq    F            M00022070B:B04    CH03MAH
1502    9668         RTA00002908F.g.02.1.P.Seq    F            M00022421A:F12    CH03MAH
1503    29446        RTA00002906F.o.01.1.P.Seq    F            M00022070B:B04    CH03MAH
1504    7171         RTA00002887F.m.22.1.P.Seq    F            M00001421B:E07    CH01COH
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Table 3 



SEQ 
ID 
I 

2 
3 

4 ■ 

5 
6 
7 
3 
9 
10 


Nearest N< 

ACCESSION 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


■iahbor (BlastN vs. Gei 

DESCRIPTION 

<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


ibank) 

P VALUE 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


Nearest Neiehboi 

ACCESSION 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


fBtastX vs. N'on-Redundani Prot 

DESCRIPTION 1 

<NONEp 

<N'ONE> 

<NONE> 

<NONE> 

<NONE> 

<NONE> 

<NONE> ~~~ 

<NONE> 

<NONE> 

<NONE> 

<NONE> i 


einsi 

> V.ALUE 
<NONTE> 
<NON"E> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


LI 
12 
13 
[4 

15 
16 
17 
IS 
19 
20 
21 
22 

23 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE:> 
<NONE> 
<NONE> 

<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 

<NONE> 


<NONE> 
<NONE> 
<IMONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 

<N"ONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 

543562 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<N'ONE> 
<NONfc> 
<NONE> 
<NONE> 
<NONE> 

<NONE> j 

[CONTAINS: RNA 
ocdi rrnF ■ HFI [CASE" 
COAT PROTEIN] 2.7.7.43) - 
apple stem grooving virus 
(strain P-209) 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 

9.2 


24 


<NONE> 


<NONE> 


<NONE> 


416959 


EXCISION REPAIR PRO 1 hlN 

T?0 /T" ' A HNj A r/*n*iir helicQSd 

ERCC6 - human >gi| 1 82 1 S 1 
(L04791) excision repair protein 
[Homo sapiens] 


3.9 


25 


<NONE> 


<NONE> 


<NONE> 


3327096 


(AB014541) KIAA0641 protein
[Homo sapiens]


8." 


26 


<NONE> 


<NONE> 


<NONE> 


36 1293 


(U28741) F35D2.1 gene
product [Caenorhabditis
elegans]


7.9 


27 


<NONE> 




<NONE> 


3297821 


(AL031032) extensin-like
protein


5.5 


2S 


<N'ON'E> 


<NONE> 


<NONE> 


21 19692 


transforming growth factor-beta 
type III receptor - chicken
>gi|511843 (L01121)
transforming growth factor-beta
type III receptor [Gallus gallus]
protein kinase PRK1 - human 


5.1 
5.0 


29 


<NONE> 


<NONE> . 


<NONE> 


21 3602 S 







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' Nearest Neighbor ( Bja stNjj s. Oenban ^ 



SEQ 

ID {ACCESSION 



-SO 1 <NONE> 



DESCRffTION_ 



— ^^mjXvs Non-Redundant pS 



P VALUE 



ACCESSION, 



DESCRIPTION IP VALUE 



31 1 <NONE> 



<NONE> 



<NONE>_UiONE>J 



2358287 



(AF040659) No definition line
found [Caenorhabditis elegans]

™ ■ «.nj\ If d rUAmn 



(AFO 10404) ALR [Homo 
fz&'SSll) predicted using 



4.6 



- rT ^Mt^ <NONE> 



3877816, 



<NONE> 



<KONE> 
<NONE> 
<NONE> 



<NQNE> 
<NONE> 
<NONE>. 



V <NONE> <NONE> 



<NONE> 



38 1 <NONE>_ 



<NONE>. 



39 1 <NONE> 



<NONE>_ 



<NONE> 
<NONE> 



AO I <NONE> 



<NQNE> 



<NONE>_ 



41 1 <NONE>. 



<NONE> _ 



4140268 



121073 



1718298 



2352538 



3192897. 



561645 



Genefinder; cDNA EST
EMBL:D65516 comes from this
gene; cDNA EST yk191a5.5
comes from this gene
[Caenorhabditis elegans]

■^•^^ r-««-% r~" TS A n ivi 1 1 Tl 



v^u.v***'-'*" ; ■ 

{V 14953) SRCR domain 
membrane form 2 



mem""" "- — , 

(U51183)transposase [Hydra 

vulgaris] 



4.5 



4.4 



4.1 



' (U45958) pistil extensm-hke 
U R i[ nn_ protein FNiCOtiaj^] 

- Li£ glucocorticoid 



4.0 



3.9 



3.9 



(U75698) ORF 45; contains an 
extended acidic domain; EBV 
BKRF4 homolog [Kaposi's 
sarcoma-associated herpesvirus; 
homolog, conserved in other 

gamma-herpesviruses

(AF006564) alcohol
dehydrogenase [Drosophila
persimilis]



pel aum»>j 1 \ — 

(AF066071) SP85; fsB
[Dictyostelium discoideum]



2.6 



1-4 



1.4 



<N QNE> I 3878S57 . 



(L33421) This CDS feature is 
included to show the translation 
of the corresponding V.region. 
Presently translation qualifiers 

onV rr r™ f -™ iresareillegal 
TZ83120) predicted using 
GenefindencDNAEST 
EMBL.D35016 comes from this 
gene; cDNA EST 
EMBL-.D32583 comes from this 
gene; cDNA EST 
EMBL:D35258 comes from this 
gene; cDNA EST 
EMBL.C1 1471 comes from this 
L ne -. cDNA EST EMB UC^__ 



1.0 



1.0 
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SEQ 
ID 



Nearest Neighbor ™--«N vs. Genbank) 



ACCESSION 



DESCRIPTION ' 



P VALUE 



"Nearest Neighbor ftttoiX vs. Non- Redundant Prote.nT 



ACCESSION 



DESCRIPTION 



1 P value] 



42 I <NONE> 



43 



<NONE> 



<NONE> 



45 | <NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



46 I <NONE> 



47 I <NONE> 



48 I <NONE> 



49 I <NONE> 

50 I <NONE> 



<NONE> 



<NONE> 
<NONE> 



1658571 

2338034 
3043714 



norvegicus . — _ — 

(AF005370) putative immediate 
early protein [Alcelaphine
herpesvirus 1]

(AB0U167) KIAA0595 protein 



<NONE> 



1723710 



(U7S903) UGT1A7 [Rattus 



HTFOT 

PROTEIN IN ASN2-PHB1
INTERGENIC REGION
>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]



1.0 



0.86 



<NONE> 



1723710 



<N0NE> 



<NONE> 



<NONE> 
<NONE> 



52 1 <NONE> 



<NONE> 



<NONE> 



<NONE> 



2996117 



<NONE> 



4151809 



<NONE>, 
<NONE> 



2773341 
1653522 



PROTEIN IN ASN2-PHB1
INTERGENIC REGION
>gi|2131678|pir||S64439
hypothetical protein YGR130c -
yeast (Saccharomyces
cerevisiae)

>gi|1323215|gnl|PID|e243523
(Z72915) ORF YGR130c
[Saccharomyces cerevisiae]



(AF046125) immediate early 2
[Rat cytomegalovirus]



(AF102855) synaptic SAPAP- 
interacting pr otein Synamon 



0.40 



0.38 



0.26 



(AF040954) putative protein
phosphatase 1 nuclear targeting
subunit [Rattus norvegicus]
(D90914) hypothetical protein
HYPOTHETICAL lOO.b KL» 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6 04C IN CHROMOSOME 



<NONE> 



3219965 



4185567 



(AF115480) cAMP-dependent
Rap1 guanine-nucleotide
exchange factor [Mus musculus]



0.024 



3e-06 



7e-07 
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■ M-iohhnr fRI.-wtX vs. Non-Redundant Proteins) 1 


SEQ 
1 ID 


Nearest Ne 
ACCESSION 


iphbor (BlastN vs. Uer 
DESCRIPTION 


iDanKi 
P VALUE 


ACCESSION 

h 


DESCRIPTION I 
YVOTHM ICAL 43.5 KD 


' value! 


53 1 


<NONE> 


<NONE> 


<NONE> 1 


[PROTEIN C34E 10. 1 IN 
CHROMOSOME in 
>gi|500724 (U 10402) C34E10.1 

Igene product (Caenorhabditis 
1 176527 elegansl 


3e-20 1 


54 I 


( 

X85444 I 


3.pallida repetitive 
3NA element 


50 J 


' Ibeta-globin - chimpanzee 

2118936 (fragment) . 


8.6 1 


55 1 


I 
1 

X72961 


Synechococcus sp. 

2pfio» CpCr\ gcnca anw 

QRF3 


5.0 


w 
It 
1 

462569 


\SSOCIATED PROTEIN 1A 
nicrotubule-associated protein 
VIAPlA-rat>gi|205538 
lorveeicusl 


2.2 I 






Human wu repeat 
protein HANI 1 
nyRNA. complete cds 


5.0 j 


1 

3875538 


(Z67990) similar to cuticle 


1.3 I 


56 


U94747 


Homo sapiens 
integrin alpha-7 
m DMA comolete cds 


5.0 


2147194 


collaeen - Paralvinelta frasslei 


0.002 


I ^ 


Z50798 


G.gallus mRNA for 
r>52 


5.0 


1 3122885 


ASPARTYL-TRNA
SYNTHETASE synthetase
[Bacillus subtilis]


3e-ll 


58 


AB002384 


Human mRNA tor 
KIAAuooo gene, 
complete cds 


5.0 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


9e-12 


59 
60 


1 X14835 


Thermofilum pendens 

IriXJA Sr\r nnd 

iufna lor iDj *iiiu 
|23S ribosomal RNA, 
ItRNA-Met, and tRNA 
Gly 


4.9 


1 <N0NE> 


<NONE> 


<NONE> 




1 U87I49 


IHordeum vulgare 
Inucellin gene, 
[complete cds 


4.9 


1 128578 


NONSTRUCTURAL 
PROTEIN NS-S spotted wilt 
(virus (strain CPNH1) non- 
1 structural protein [Tomato 
lspotted wilt virus] . 


2.8 


61 


1 DS7541 


Imus musculus gene 
Ifor integnn alpha v 
Isubunit. promoter 
Ireeion 


4.9 


I 136956 


HYPOTHETICAL PROTEIN 
UL61 cytomegalovirus (strain 
AD 169) cytomegalovirus] 


0.038 


62 
1 63 


j U72520 


IMus musculus mcna 
(protein (Mena) 
mRNA. complete cd 


; 4.9 


| 3413892 


(AB007934) KIAA0465 protein
[Homo sapiens]


1 

6e-07_ 



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SEQ 
ID j 


Nearest Ne 
\CCESSION 


iahbor (BlastN vs. Genbank) 
DESCRIPTION j P VALUE 


Nearest Nciehbot 
ACCESSION 


(BlastX vs. Non-Redundant Prot 
DESCRIPTION | ] 


eins) 

3 VALUE 


64 1 


e 

i 
r 

c 
i 

S79797 i 


nzymaoc 1 
,lycosylation- j 
egulating gene [rats, j 
Sprague-Dawley, 
treptozotocin 1 
iiabeiic. heart, 1 
tiRNA. 5010 nil 


4.8 


<NONE> 


<NONE> 


<NONE> . 


65 


ABO 11102 


•iomo sapiens mRNAl 
forKIAA0530 
)fotein. partial cds j 


4.8 


138022 


RECEPTOR RECOGNLZ.INU 
PROTEIN gp38 - phage Ox2 
>gi|15126 (X05675) gene 38 
(AA 1-266); pid:g 15 126 
[Bacteriophage 6x2] 


3.6 


66 


AF100985 


'enaeus monodon 1 
phosphopymvate | 
tiydratase mRNA. 
complete cds ] 


A O 

4.8 


500615 


(D16221) endochitinase [Oryza 
satival 


2.8 


67 


> U31756 


Bacillus subtilis 
gamma- 
aminobutyrate 
permease cds 


4.8 


3880699 


(AL021471) similar to
Eukaryotic aspartyl proteases
[Caenorhabditis elegans]


2.8 


68 


1 U25111 


Pisum sativum 
chloroplasi 
processing enzyme 
mRNA. nuclear gene 
encoding chloroplast 
protein, complete cds. 


4.8 


1800145 


(U83658) FH1/FH2 protein
homolog [Emericella nidulans]


1.6 


69 


1 U00454 


Mus musculus Cdx-2 
homeobox protein 
gene, complete cds. 


4.7 


<NONE> 


<NONE> 


<NONE> 




1 M84166 


Hamster c-Ha-ras 
protein gene, 
complete cds. 


4.7 


1710606 


RENIN BINDING PROTEIN
(RNBP) protein [Rattus
norvegicus]


0.88 


1 70 
71 


1 AF0S7516 


Mus musculus major 
sperm fibrous sheath 
protein Pro- 
mAKAP82 gene, 
alternative splice 
cxons 1' and 1" 


4.6 


<NONE> 


<NONE> 


<NONE> 




j X74160 


M.esculenta mRNA 
for granule-bound 
starch synthase 


4.6 


<NONE> 


<NONE> 


<NONE> 


72 
73 


M97487 


Haloferax volcanii 
superoxide dismuiase 
(sodl) gene, complet 
cds. 


el 

1 4.6 


2623307 


(AC002409) putative ubiquitin
protease [Arabidopsis thaliana]


3.4 



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SEQ 
ID 



Nearest Nei ghbor (BlaslN vs. Genbank) 

DESCRIPTION I P VALUE 



ACCESSION 



75 



76 



77 



78 



79 



M57889 



D49708 



Drosophila
melanogaster
suppressor of sable
gene, complete cds.
Rattus norvegicus 
mRNA for RNA 
binding protein 



84 



D31853 



Z47036 



L 19660 



4.5 



4.5 



Yeast GTS 1 gene for 
glycin-thrconin/serinel 
repeat protein, 
complete cds 



4.5 



Human partial cDNA | 
sequence, clone 
bs613j 



2.9 



Rattus norvegicus 
gastric inhibitory 
peptide receptor 
mRNA, complete cds I 2.1 



X82841 



A thaliana Acq gene 



80 1 X61931_ 



S.purpurascens famA 
and farnB genes for 
FAS domain and acyl 
CoA-dehydrogenases 
respectively 



82 I AB0O7869 

83 1 X97479 



X98374 



85 I AE000710 



2.6 



Human lactate 
dehydrogenase-C 
(LDH-C) mRNA, 
complete cds. 
Homo sapiens 
KIAA0409 mRNA. 
partial cds 
H.sapiens mas proto- 
oncogene. 5' region 



2.5 

2.4 
2.1 



R.norvegicus mRNA 
for KIS protein 



Aquifex aeolicus 
section 42 of 109 of 
the complete genome 



L.9 



Nearest Neighbor fBlastX vs. N»n-R e dundant Proteins) 



ACCESSION 



DESCRIPTION 



1.9 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



2447195 



(U42580) NETTF (7x), DETTS
(4x) [Paramecium bursaria
Chlorella virus 1]



<NONE> 



<NONE> 



<NONB>l 



3.3 



<NONB>| 



2358279 



(AF007871) torsinA [Homo
sapiens]



483212 



immediate-early protein El 10 - 
human herpesvirus 1 (strain 
HFEM1 (fragment) 



(U95031) sublingual gland 
7090^34 mucin [Homo sapiens] 



2887449 

3130157 
<NONE> 



(AB007874) KIAA0414 [Homo
sapiens]



(AB008859) pheromone
receptor [Fugu rubripes]



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



2e-07 



8.4 



0.47 



31 



<NQNE>1 



<NONE>] 



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SEQ 
ID 



Nearest Neighbor (BlastN v s. Genbank) 



Arr-F<;sTr>N| DESCRIPTION 



86 



91 



D30612 [partial cds 



i Homo sapiens mRNA 
for repressor protein 



iHomo sapiens 
PMP69 gene, exons 
87 1 Y 1432 1 18.9,10 & LI 



E.coli genomic DNA 
Kohara clone 

88 1 D90773 #762(30.3-30.5 min.) 
~ lArchaeoglobus 

|fulgidus section 116 

lof 172 of the 

89 | AEO00991 complete genome 
lRatrus norvegicus 
p95 Vav (Vav) proto 
oncogene mRNA. 

90 I U39476 cn mpletecds. 



P VALUE 



(Human transcription 
factor TFUIB 90 kDa 
U28838 Isubunit 



iRattus norvegicus 
Jsynaptotagmin VII 

92 | U20 106 I mRNA. complete cds 

Mouse mammary 
tumor virus putative 
iintegrase, env 
Ipolyprotein. and 
superantigen mRNA, 

93 1 AF071010 [complete cds 



Mesocricetus auratus 
c-fos proto-oncogene 
protein (c-fos) gene, 

94 | AF06 1881 [complete cds 
' .asmodium 

falciparum 

.chromosome 2. 

section 34 of 73 of 

the complete 

95 I AEO01397 I sequence 



1.9 



1.9 



1.9 



1.9 



1.9 



1.9 



1.9 



1.8 



1.8 



barest Neighbor (BlastX vs. Non -R*rl,indant noteTnTT 



ACCESSION, 



DESCRIPTION |P VALUE! 



<NONE> 



<NONE>_ 



I <NONE>] 



<NONE> 



<NONE>_ 



(D78305) DNA binding protein
[Chlorella virus]



I <NON£>] 



7.9 



(X79095) 

I py mvate.orthophosphate 
dikinase [Flaveria trinervia] 



2.7 



4158178 



(AL023496) hypothetical 
tefOTHh TlCAL pkdLIWh- 



1.6 



2495730 



RICH PROTEIN KIAA0269 
>gi|1665805|gnl|PlD|dl014089 
[(D87459) Similar to Volbox 
Icarteri extensin (S22697) 
UHomo sapiens] > 0 23 



UL47h protein - Mareks disease 
478380 Ivirus 



|(AC004010) similar to Leucine- 
Irich transmembrane proteins; 
144% similarity to U42767 
|(PID:g 17369 18) [Homo 
2781386 jsapiens] 



0.23 



4e-33 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



I <NONE> 



<NONE> | 



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WO 01/02568 
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SEQ 
ID 



Nearest M-i^wmi^tN vs- Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



]Human polymorphic 
|MspI site DN A 
X5 KD3S3 locus) 



Human P 82(ST5) 
mRNA, alternatively 

106 j tlimo [spliced, complete cds 

._ Homo sapiens 

synaptotagmin Vll 

107 | AF038S3S [m RNA. partial cds 
Homo sapiens clone 
23585 mRNA 

108 L AF052134 [sequence 
|H. sapiens H£K2 

mRNA for protein 

(tyrosine kinase 

[receptor. 



109 I X75208 



110 



111 



jXenopus laevis 
mRNA for SOX-D 
ABO I 3896 complete cds 

[Human HepUi 3' 
[region cDNA.clone 
D 16947 |hmd 6bl0 



1.8 



112 



Mouse DNA, T early 
D13547 alpha (TEA) region . 



113 j M35498 



114 | M84166 



jwoodchuctc c-myc 
protein gene, exon 1. 
Hamster c-Ha-ras 

[protein gene. 

[complete cds 



115 I U33135 



116 I US40O3 



IMychodea carnosa 
18S ribosomalRNA 
jgene, complete 
[ sequence 
Homo sapiens 
putative tumor 
suppressor (BIND 
pene. exons 7-12 



1.8 



1.8 



1.8 



1.8 



1.8 



1.8 



1.8 



1.8 



1.8 



1.8 



- r , nrr .. M^iw (RlastX vs. Non-Redundant Proteins) 



ACCESSION 



1.7 



2136878 



3638957 



457927 



232263 



1730198 



249450 1_ 
3413870 



3393018 



DESCRIPTION 



p value! 



keratin KAP5.5- sheep 
l(fragment) >gi)3 13722 



|(AC004877) sco-spondin-roucin 
like; similar to P98167 uncertain 
I [Homo sapiens] 



3183405. 
3386622 

3334982 

<NONE> 



(U00690) calcium channel alpha 
1 subunit [Drosophila 
rnelanoeaster] 

HOMEOBOX PROTEIN HOX- 
Dl (HOX-4.9) 

GROWTH-ARREST-SPECinc| 
[PROTEIN 1 gene product 
IfHomo sapiens] 



TRANSCRIPTION FACTOR 
FKH-4 factor [Mus musculus] 



0.65 



0.64 



0.51 



0.28 



0.22 



(AL031 174) hypothetical 

te v PUlHEllLALilJiaJ ~ 
PROTEIN C2C6-07 IN 
CHROMOSOME I 
>gi|2370504|gnl|PID]e339l94 

|pombe] 

>gi|345l305|gnl|PIDlel3 16730 
(AL031324) very hypothetical 
Jprotein [Schizosaccharomyces 
pombe] 

l(AC004665) unknown protein 
If Arabidopsis thaliana] 

(AC005306) R27216J [Homo 



<NONE> 



0. 17 



(AB007923) KIAA0454 protein 
IfHomo sapiens] ; ' 0 002 



5e-08 



8e-10 
2e-10 

3e-22 

£NONE>| 
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[sEQ 
1 10 


Nearest N< 
ACCESSION 


;iehbor(BlasiN vs. Ge 
DESCRIPTION 


nbankl 
P VALUE 


Nearest Neighbo 
ACCESSION 


r (BlastX vs. Non-Redundant Pro 
DESCRIPTION 


teins) 

P VALUE 


117 J 


i 

( 

AE001121 i 


Borrelia burgdorferi
(section 7 of 70) of
the complete genome


1.7 


<NONb> 


<NONE> 


<NONE> 


118 1 


AE001114 


Archaeoglobus
fulgidus section 165
of 172 of the
complete genome


1.7 


<NONb> 


<NONE> 


<NONE> 


119 


U82064 


Angiostrongylus
cantonensis adult- 
specific muscle 
protein- 1 gene, partial 
cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


120 


AF041836 


Buchnera aphidicola 
plasmid pLeu-Sg. 
complete plasmid 
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


121 


M87479 


Lymnaea stagnalis 
FMRFamide gene, 
mature peptides. 


1.7 


<NONE> 


<NONE> 


<NONE> 


122 


M55163 


Xenopus laevis 
fibroblast growth 
factor receptor 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


123 


S57565 


histamine H2- 
receptor [rats. 
Genomic. 1928 ml 


1.7 


<NONE> 


<NONE> 


<NONE> 


124 


1 M27256 


Simian 

immunodeficiency 
virus (SIV) pol 
region. 


1.7 


<NONE> 




<NONE> 


125 


J U31516 


Human chromosome 
8 anonymous clone 
pBS8-165 


1.7 


<NONE> 


<NONE> 


<NONE> 


126 


J X 12671 


Human gene for
heterogeneous nuclear
ribonucleoprotein
(hnRNP) core protein
A1


1.7 


<NONE> 


<NONE> 


<NONE> 


127 


AF009054


Paeonia suffruticosa
ssp. spontanea
alcohol
dehydrogenase 1B
(Adh1B) gene, partial
cds


1.7


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BbstN vs. GenbanJO 



"Nearesl Neighbor mtastX vs. Non-Redundant Proteinsj 



SEQ
ID
ACCESSION



DESCRIPTION



P VALUE
ACCESSION



DESCRIPTION



P VALUE



128 AF046917



Mus musculus
transketolase gene,
exon 6 and partial cds



129 D89033



130 U57968



Homo sapiens mRNA
for Acyl-CoA
synthetase 3,
complete cds
Staphylothermus
marinus surface layer-
associated STABLE
protease gene,
complete cds.



132 X04980



Bovine herpesvirus 1
(clone p95) UL24
homologue gene,

complete cds.

Drosophila simulans
retrotransposon 297

5'-LTR and flanks
(pWK102C)



133 AE001114

134 X04434



135 U07890



Archaeoglobus
fulgidus section 165
of 172 of the
complete genome
Human mRNA for
insulin-like growth
factor 1 receptor
Mus musculus
C57BL/6J epidermal
surface antigen
(mesa) mRNA,
complete cds.



136 D26163

Human tyrosinase
gene, 5'-flanking
region (cell-specific
transcription)



1.7



<NONE>



<NONE>



<NONE>



1.7 



<NONE> 



<NONE> 



137 AF093818



Panorpa nipponensis
NADH
dehydrogenase
subunit 5 gene,
mitochondrial gene
encoding
mitochondrial
protein, partial cds



1.7 



<NONE> 



<NONE> 



1.7 



1.7 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



1.7 



1.7 



1.7



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE> 



1.7 



<NONE> 



<NONE> 



1.7



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>
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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


138 


D50560


Xenopus laevis
mRNA for
cytochrome P-450,
complete cds


1.7 


<NONE> 


<NONE> 


<NONE> 


139 


AF083488 


Mus musculus
phospholipase D1
(PLD1) gene, exons
18 and 19, complete 
sequence 


1.7 


<NONE> 


<NONE> 


<NONE> 


140 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


1.7


<NONE>


<NONE> 


<NONE> 


141 


M73749 


Streptococcus 
salivarius 

thermophilus beta-D- 
galactose (lacZ) gene, 
complete cds. >
gb|M63636|STRLACZZ Streptococcus
thermophilus beta-D- 
galactosidase (lacZ) 
gene, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


142 


AE001114


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.7 


2183023 


(U84971) unknown [Homo
sapiens]


9.2 


143 


L01983 


Human type IV 
sodium channel alpha 
polypeptide 


1.7 


130504 


GENOME POLYPROTEIN
[CONTAINS: N-TERMINAL
PROTEIN (P1); HELPER
COMPONENT PROTEINASE
INCLUSION PROTEIN (CI); 6
KD PROTEIN 2 (6K2);
GENOME-LINKED PROTEIN
(VPG); NUCLEAR ... virus
(strain D)


9.2 


144 


L19731


Plecotus rafinesquii 
mitochondrial 
cytochrome b gene, 5'
end.


1.7 


3327096 


(AB014541) KIAA0641 protein 
[Homo sapiens] 


9.1 


145 


AE001114


Archaeoglobus
fulgidus section 165
of 172 of the
complete genome


1.7 


2183023 


(U84971) unknown [Homo 
sapiens] 


8.8 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















146 


L27218 


tannic cpn 1 m 

amine oxidase 
mRNA, complete cds. 
> oxidase=amiloride- 
binding protein 
homolog [cattle, liver,
mRNA, 2664 nt]


1.7 


1174459 


SIGNAL TRANSDUCER AND 
ACTIVATOR OF 
TRANSCRIPTION 6 (IL-4 
STAT) >gi|559855 (U16031) IL-
4 Stat [Homo sapiens]


7.1 


147 


Z49868 


Caenorhabditis 
elegans cosmid 
W07E11, complete
sequence
[Caenorhabditis
elegans]


1.7 


4204263 


(AC005223) 40409
[Arabidopsis thaliana]


6.7 


148 


AL022271 


Caenorhabditis
elegans cosmid
F32F2, complete
sequence
[Caenorhabditis
elegans]


1.7 


2497969 


PERIPLASMIC NITRATE
REDUCTASE PRECURSOR
>gi|1086107|pir||S50163 nitrate
reductase large chain precursor, 
periplasmic - Thiosphaera 
pantotropha >gi|600093 
t7'Kf*T7'%\ n/»ri nlnsmic nitrate 
reductase large subunit 
[Paracoccus denitrificans] 


6.7 


149 


U43844 


Mus musculus cyclin 
D3 gene, complete 
cds 


1.7 


3861490 


(AF062037) capsid protein 
precursor [Thosea asigna virus]


5.1 


150 




S.cerevisiae UNF1, 
LTV1, MRP8, CYB3
and TGL1 genes.


1.7 


1255404 


(U53151) weak similarity to 
cytochrome b [Caenorhabditis 
elegans]


4.1 


151 


U77846 


Human elastin gene, 
partial cds and partial 
3'UTR 


1.7 


3355682 


(AL031124) putative secreted
lyase


4.0 


152 


X62880 


S.scrofa mRNA for 
calcium release 
channel (CRC) 


1.7 


3327080 


(AB014533) KIAA0633 protein
[Homo sapiens] 


4.0 


153 


Y00067


Human gene for 
neurofilament subunit 
M (NF-M) 


1.7 


479829


heterogeneous ribonuclear 
particle protein homolog -
Caenorhabditis elegans 
similarity to RNA recognition 
motifs [Caenorhabditis elegans] 


3.9 
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Nearest Neiehbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















154 


X68393 


D.melanogaster gene 
for Beta-tubulin, 
exons 1 and 2 


1.7 


2342682 


(AC000106) Contains similarity 
to Rattus AMP-activated protein 
kinase (gb|X95577). 
[Arabidopsis thaliana]


3.8 


155 


AB012284


Shuttle vector 
pAUR123 gene for 
Aur.l-C, complete cds 


1.7 


417704 


POL POLYPROTEIN 
(ORF1A/1B) [CONTAINS:
RNA-DIRECTED RNA
POLYMERASE; HELICASE;
PROTEASE]


3.8 


156 


M96633 


Rattus norvegicus 
mitochondrial 
intermediate 
peptidase (MIP) 
mRNA, complete cds.


1.7 


2314209 


(AE000613) H. pylori predicted 
coding region HP1054


3.1 


157 


U49055 


Rattus norvegicus 
CTD-binding SR-like 
protein rA8 mRNA, 
complete cds 


1.7 


2497252 


INSULIN-LIKE GROWTH
FACTOR BINDING PROTEIN
4 (IGFBP-4) (IBP-4) (IGF-
BINDING PROTEIN 4) factor-
binding protein-4 - sheep
(fragment) factor-binding
protein-4, IGFBP-4 [sheep,
liver, Peptide, 237 aa] [Ovis
aries] 


3.0 


158 


Y15907


Mus musculus mRNA 
for myc-intron- 
binding protein-1


1.7 


912776 


iduronate-2-sulfatase, IDS {EC
3.1.6.13} Peptide Mutant, 550
aa]


3.0 


159 


U67600 


Methanococcus 
jannaschii section 142 
of 150 of the 
complete genome 


1.7 


2982355 


(AF052252) fork head domain 
protein FKD9 [Danio rerio] 


3.0 


160 


AF013759


Homo sapiens 
calumenin (Calu)
mRNA, complete cds


1.7 


2982355 


(AF052252) fork head domain 
protein FKD9 [Danio rerio] 


2.9 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(.^u-iuw; -numaniy iu - ■ ■— 




161 


AF062915 


Arabidopsis thaliana 
putative transcription 
factor (MYB90) 
mRNA. complete cds 


1.7 


3878065 


Human mRNA product
KIAA0077 (TR:Q14997);
cDNA EST yk243h8.5 comes
from this gene; cDNA EST
yk243h8.3 comes from this
gene; cDNA EST yk359h4.3
comes from this gene
[Caenorhabditis elegans]
>gi|3880318|gnl|PID|e1349839
(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997); cDNA EST
yk243h8.5 comes from this 
gene; cDNA EST yk243h8.3 
comes from this gene; cDNA 
EST yk359h4.5 comes from this 
gene 


2.3 


162 


X87526 


H.sapiens genomic 
DNA (chromosome 
3: clone NL3003R) 


1.7 


3638957 


(AC004877) sco-spondin-mucin- 
like; similar to P98167 uncertain 
[Homo sapiens]


2.3 


163 


AC005573 


Homo sapiens 
chromosome 5, PAC 
clone 202e13


1.7 


2465540 


(AF005632) phosphodiesterase
I/nucleotide pyrophosphatase 
beta [Homo sapiens] 


1.8 


164 


D83402 


Homo sapiens gene 
for prostacyclin 
synthase, exon 10 and 
complete cds 


1.7 


627608 


steroid hormone receptor TR3 - 
human sapiens] 


1.7 


165 


AF053700


Homo sapiens deltex 
(Dx) mRNA, 
complete cds 


1.7 


2662089 


(AB007864) KIAA0404 [Homo
sapiens] 


1.7 


166 


AF043225 


Mus musculus 6-
pyruvoyl-
tetrahydropterin
synthase (Pts)
mRNA, complete cds


1.7 


2352538 


(AF006564) alcohol 
dehydrogenase [Drosophila
persimilis]


1.4 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















167 


U52917 


Thermus aquaticus
thermophilus NADH
dehydrogenase I
subunits NQO7,
NQO6, NQO5,
NQO4, NQO2,
NQO1, NQO3,
NQO8, NQO9,
NQO10, NQO11,

NQO14, complete
cds. 


1.7 


2564334 


(AB006631) The human
homolog of mouse Cux-2
[Homo sapiens]


1.0 


168 


X72222 


M.musculus gene for 
serotonin 2 receptor 


1.7 


3875796 


(.£/>»Z3j similarity to i east 1 

i i^puuiciivJj i t.r*y pruicm 
(SW:YIK9_YEAST); cDNA 
EST EMBL:T01252 comes 
from this gene; cDNA EST 
EMBL:D33205 comes from this
gene; cDNA EST
EMBL:D33955 comes from this
gene; cDNA EST
EMBL:D35484 co...


1.0 


169 


in 1 9.6. 


Crotalus scutulatus 

PLA2-like 

pseudogene 


1.7


or 1 rn i 

853971 


(X83413) DR5 [Human
herpesvirus 6] >gi|853972
(X83413) DR5 [Human
herpesvirus 6]


0.99 


170 


MS3118 


Mus musculus factor 
VIII-associated
protein (f8a) mRNA, 
complete cds. 


1.7 


3201617 


(AC004669) hypothetical 
protein [Arabidopsis thaliana] 


0.80 


171 


M38347 


E.coli ATP-
dependent proteinase
(lon) gene, complete
cds.


1.7 


4140322 


(AL031282) dJ283E3.3.2 (Cell
Division Cycle 2-Like 2
(PITSLRE, p58/GTA,
Galactosyltransferase 
Associated Protein Kinase)) 
(isoform beta 2-2) [Homo 
sapiens] 


0.78 


172 


U2S838


Human transcription
factor TFIIIB 90 kDa
subunit


1.7 


2495730


HYPOTHETICAL PROLINE-
RICH PROTEIN KIAA0269
>gi|1665805|gnl|PID|d1014089
(D87459) Similar to Volvox
carteri extensin (S22697)
[Homo sapiens]


0.62 
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Nearest Neighbor (BlastN vs. Cenbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















173 


U72487 


Rattus norvegicus 
calcium-independent 
alpha-latrotoxin 
receptor mRNA, 
complete cds 


1.7 


544411


GLYCOPROTEIN GP100 
PRECURSOR (P29F8) 
discoideum] 


0.35 


174 


AE000718 


Aquifex aeolicus 
section 50 of 109 of 
the complete genome 


1.7


2497569 


FIBROBLAST GROWTH 
FACTOR RECEPTOR 3 
PRECURSOR (FGFR-3) 
(HEPARIN-BINDING
GROWTH FACTOR 
RECEPTOR) 

1 1 "7QC f l n :«llT<C'3£^ 

>gl|Zll /ojl[pir||133J0J 

fibroblast growth factor receptor 
3 - mouse >gi|199145 (M81342)
fibroblast growth factor receptor 
3 [Mus musculus] 


0.34 


175 


AF016897


Oryza sativa GDP 
dissociation inhibitor 
protein OsGDI2 
(OsGDI2) mRNA, 
complete cds 


1.7 


125362 


MACROPHAGE COLONY
STIMULATING FACTOR I
RECEPTOR PRECURSOR
(CSF-1-R) (FMS PROTO-
ONCOGENE) (C-FMS) factor 1 
receptor - cat >gi| 163S55 
(J03149) M-CSF receptor [Felis 
domesticus] 


0.34 


176 


U95102 


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA, complete cds


1.7 


85058 


muscarinic acetylcholine 
receptor - fruit fly acetylcholine 
receptor [Drosophila 
melanogaster]


0.20 


177 


AF077352 


Chlamydomonas 
reinhardtii myosin
heavy chain 


1.7 


728901 


ACROSOMAL PROTEIN SP-

10 PRECURSOR SP-10 - 
western baboon 
>gi|298488|bbs|127113 
(S56458) SP-10=intraacrosomal 
protein [Papio papio=baboons,
Peptide, 285 aa] [Papio
hamadryas]


0.20 


178 


Z92788 


Caenorhabditis 
elegans cosmid 
F53B8, complete 
sequence 
[Caenorhabditis 
elegans] 


1.7 


746516 


(U23517) D1022.7 
[Caenorhabditis elegans]
>gi|3258651 elegans]


0.068 



WO 01/02568 



PCT/US00/18374



SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


179


AF002217


Ralstonia eutropha
megaplasmid pHG1
nitric oxide reductase
(norB) gene,
complete cds


1.7


1143538


(X87883) mitochondrial capsule
selenoprotein [Rattus
norvegicus] >gi|1354135
(U48702) mitochondria
associated cysteine-rich protein
SMCP


0.039 


180 


D30749 


Rat mRNA for 
protein tyrosine 
phosphatase 


1.7 


1228035 


(D83776) The KIAA0191 gene 
is expressed ubiquitously.; The 
KIAA0191 protein retains the
C2H2 zinc-finger at its N- 
terminal region. [Homo sapiens] 


0.008 


181 


M15202


Rat fast skeletal TnT 
gene encoding 
troponin T isoforms, 
complete cds. 


1.7 


731172 


SKIN SECRETORY PROTEIN 
XP2 PRECURSOR 


4e-04 


182 


L07592 


Human peroxisome 
proliferator activated 
receptor mRNA, 
complete cds. 


1.7 


4033414 


PUTATIVE IMPORTIN BETA-
4 SUBUNIT


2e-06


183 


U64031 


Dendrobium 
crumenatum ACC 
synthase gene, 
complete cds 


1.7 


3122885 


ASPARTYL-TRNA 
SYNTHETASE synthetase 
[Bacillus subtilis]


2e-11


184 


AF034970 


Homo sapiens
docking protein 
(DOK-2) mRNA, 
complete cds 


1.7 


2289097 


(U78737) 

alpha(1,3)fucosyltransferase
[Cricetulus griseus]


8e-12 


185 


Z12839 


L.longiflorum mRNA
encoding calmodulin. 

> :: 

gb|L18912|LILCALy 
ODU Lilium 
longiflorum 
calmodulin mRNA, 
complete cds. 


1.7 


2511747 


(AF023270) probable 
transcriptional regulator dre4 


4e-12






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


186 


X53459 


Equine arteritis virus 
(EAV) RNA genome 

> :: 

emb|A45589|A45589 
Sequence 1 from 
Patent WO9519438 >

emb|A58849|A58849 
Sequence 1 from 
Patent WO9700963 > 

gb|AR013959|AR013959 Sequence 1 from
patent US 5773235 


1.7 




3979817 


(Z70033) Weak similarity to
Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...
Human tyrosine-protein kinase
CSK (SW:CSK_HUMAN);
cDNA EST EMBL:C10908
comes from this gene; cDNA
EST EMBL:C12822 comes
from this gene; cDNA EST
yk408c2.3 comes from this
gene; cDNA EST yk408c2.5 ...


1e-14


187 


K02668 


E. coli ddl gene
encoding D-alanine:D-
alanine ligase and
ftsQ and ftsA genes,
complete cds, and
ftsZ gene, 5' end.


1.7 


3879121 


(Z70310) predicted using
Genefinder; Similarity to Mouse
ankyrin (PIR Acc. No. S37771);
cDNA EST EMBL:T01923
comes from this gene; cDNA
EST EMBL:D32335 comes
from this gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES... Genefinder;
Similarity to Mouse ankyrin
(PIR Acc. No. S37771); cDNA
EST EMBL:T01923 comes
from this gene; cDNA EST
EMBL:D32335 comes from this
gene; cDNA EST
EMBL:D32723 comes from this
gene; cDNA ES...


2e-19


188 


AB008375 


Homo sapiens mRNA 
for osteoblast specific 
cysteine-rich protein, 
complete cds 


1.7 


2496945 


HYPOTHETICAL 55.9 KD 
PROTEIN EEED8.6 IN 
CHROMOSOME II >gi|733603 
(U23484) No definition line 
found [Caenorhabditis elegans]


1e-19


189


L36603


Pseudomonas cepacia 
(clone Psudom70-1) 
heat shock protein 70 
(hsp70) gene, 
complete cds 


1.7 


2661842 


(Y15732) DNA polymerase beta
[Xenopus laevis]


6e-20



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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


190 


Z49760 


P.blakesleeanus
mRNA GTP 
cyclohydrolase I 


1.7 


1731181 


PROTEIN H4A4.3 IN 
CHROMOSOME Q 
>gi|3874230|gnl|PID|e 1 35 L o 1 8 
protein (Swiss Prot accession 
number P38376); cDNA EST 
yk220el0.5 comes from this 
gene [Caenorhabditis elegans] 


3e-21 


191 


U52428 


Human fatty acid 
synthase gene, partial 
cds 


1.7 


4226073


(AF125443) contains similarity
to S. pombe phosphatidyl
synthase (GB:Z28295)
[Caenorhabditis elegans]


6e-25 


192 


U12767 


Human mitogen 
induced nuclear 
orphan receptor 


1.6 


<NONE> 


<NONE> 


<NONE> 


193 


Z63478 


H.sapiens CpG DNA,
clone 85a12, forward
read cpg85a12.ft1a.


1.6 


<NONE> 


<NONE> 


<NONE> 


194 


AF084375


Homo sapiens 
inversin protein, 
exons 8 and 9 


1.6 


<NONE> 


<NONE> 


<NONE> 


195 


AE001114 


Archaeoglobus 
fulgidus section 165 
of 172 of the 
complete genome 


1.6 


<NONE> 


<NONE> 


<NONE> 


196 


AF084375 


Homo sapiens 
inversin protein, 
exons 8 and 9 


1.6 


<NONE> 


<NONE> 


<NONE> 


197 


U24217 


Kluyveromyces lactis 
RNA polymerase II 
largest subunit gene, 
partial cds 


1.6 


<NONE>


<NONE> 


<NONE> 


198 


AE000580


Helicobacter pylori 
26695 section 58 of 
134 of the complete 
genome 


1.6 


<NONE> 


<NONE> 


<NONE> 
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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


199




H.sapiens mRNA for 
Drosophila female 
sterile homeotic 
(FSH) homologue > :: 
gb|M80613|HUMFSHG Human homolog
of Drosophila female
sterile homeotic
mRNA, complete cds.


1.6


<NONE> 


<NONE> 


<NONE> 




M28064

Plasmodium
brasilianum DNA
homologous to the
histidine-rich knob
protein region of
Plasmodium
falciparum.


1.6 


457495 


(M26647) ORF X 
[Saccharomyces cerevisiae] 


8.4 


200 


U03U4

Streptomyces albus
lipase precursor (lip)
gene, complete cds,
and unidentified 5'
ORF and 3' ORF,
partial cds.

U88422

Strix varia oocyte
maturation factor
Mos (c-mos) proto-
oncogene, partial cds


1.6 


3638957 


(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens]


7.8 


201 


1.6 


137618 


VITAMIN D3 RECEPTOR 
(VDR) receptor [Rattus 
norvegicus]


6.4 


202 


M68519

Human pulmonary
surfactant-associated
protein SP-A
(SFTP1) gene,
complete cds.


1.6 


3875423 


(Z38112) E03A3.6
[Caenorhabditis elegans]


4.9 


203 


AF044575

Homo sapiens
transcription factor
POU4F3


1.6 


2133625 


GABA transport protein -
tobacco hornworm


4.7 


204 
205 
206 


L48476

Homo sapiens
(subclone 3_e10 from
P1 H21) DNA
sequence.


1.6


3687297


(AJ005588) 5-epi-aristolochene 
synthase 


4.6 


M18630

Rat CNS 2',3'-cyclic
nucleotide 3'-
phosphodiesterase


1.6 


3880315


(Z81133) Similarity to Human
mRNA product KIAA0077
(TR:Q14997) [Caenorhabditis
elegans]


3.7 
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Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



1.6 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



267068 



TUMOR-ASSOCIATED
ANTIGEN L6 



3.6 



208 



U53448 



Babesia microti heat 
shock protein 70 
(hsp70) gene, 
complete cds 



1.6 



1255429 



(U53155) strong similarity to 
the carboxyl two-thirds of valyl- 
tRNA synthetases 
[Caenorhabditis elegans]



209 



AF084367 



Homo sapiens 
inversin protein 
mRNA, complete cds



1.6 



1730076 



PROBABLE 
SERINE/THREONINE- 
PROTEIN KINASE CY49.28 
>gi|1370255|gnl|PID|e247094

(Z73966) pknJ 



210 



D55635 



Yeast dis1+ gene for
p93dis1, complete

cds



1.6 



3128353 



(AF010496) maltose transport
inner membrane protein 



211 



AF035756 



Streptomyces sp. 2- 
dehydro-3- 
deoxyphosphoheptonate
aldolase gene,

cds 



212 



X73479 



partial 
O.cuniculus rPTPA 
mRNA 



213 



X98330 



H.sapiens mRNA for 
ryanodine receptor 2 



P.anserina FMR1 
gene exons 1 and 2



1.6 



853971 



(X83413) DR5 [Human 
herpesvirus 6] >gi|853972 
(X83413) DR5 [Human
herpesvirus 6]



1.6 



3413810 



(Y17034) Bassoon [Mus
musculus]



1.6 



1.6 



2072986 



(U95142) putative G-protein- 
coupled receptor G-protein- 
coupled receptor [Arabidopsis 
thaliana] 



128014 



NECDIN >gi|91129|pir||JN0148
necdin, brain - mouse

>gi|200020 (M80840) necdin
[Mus musculus]



2.2 



1.2 



1.2 



0.97 



0.94 



0.73 



0.42 



Z92788 



Caenorhabditis
elegans cosmid
F53B8, complete
sequence
[Caenorhabditis
elegans]



1.6 



746516 



(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]



21 



AE000888 



Methanobacterium 
thermoautotrophicum 
from bases 1098908 
to 1112186 (section 
94 of 148) of the 
complete genome 



1.6 



462415 



INTERFERON- ALPHA/BETA 
RECEPTOR ALPHA CHAIN 
PRECURSOR (IFN-ALPHA-
REC) >gi|346520|pir||S27387
interferon alpha receptor type 1 
bovine >gi|432 



0.19 



0.001 1 



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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE




AB008375


Homo sapiens mRNA
for osteoblast specific
cysteine-rich protein,


1.6


2496945


HYPOTHETICAL 55.9 KD
PROTEIN EEED8.6 IN
CHROMOSOME II >gi|733603
(U23484) No definition line
found [Caenorhabditis elegans]


1e-18


217 


M25312 


Orang-utan involucrin
gene, complete cds.


1.6 


3875131 


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


3e-26 


218 




Cyprinus carpio 
mRNA for MyoD, 
complete cds 


1.5 


<NONE> 


<NONE> 


<NONE> 


219 


AB012882 
U29487 


Caenorhabditis
elegans cosmid 
C09C7 


1.5 


<NONE> 


<NONE> 


<NONE> 


220 


X74760 


M.musculus mRNA 
for Notch 3 


1.5 


1364094 


integral membrane protein -
Streptomyces pristinaespiralis
>gi|S72306 (X84072) integral
membrane protein
[Streptomyces pristinaespiralis]

EXOGLUCANASE II


4.3 


221 
222 


U72396 
U42391 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP 17.6 
mRNA, complete cds


1.5 


121855 


PRECURSOR cellulose 1,4-beta 
cellobiosidase (EC 3.2.1.91) II
precursor - fungus (Trichoderma
reesei) 1,4-beta-cellobiosidase
(EC 3.2.1.91) II - fungus
cellobiohydrolase II
[Trichoderma reesei]


4.3 


Human myosin-IXb
mRNA, complete cds


1.5 


3688428 


(AJ011534) sucrose synthase


4.2 


223


M92296 


Pongo pygmaeus 
gamma-1 and gamma-
2 globin genes,
complete cds. 


1.5 


186413 


(M13144) inhibin A [Homo 
sapiens]


0.22 


224 
225


X94144


C.japonica mRNA for
QNR-71 protein

1.5 


2745737 


(AF029791) UDP- 
Gal:betaGlcNAc beta 1,3-
galactosyltransferase-II [Mus
musculus] 


3e-08 
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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE




Homo sapiens mRNA
for KIAA0657
protein, partial cds


1.5 


1212992


(X90568) Protein sequence and
annotation available soon via
SWISS-PROT, available
via e-mail from
LABEIT@EMBL-
Heidelberg.DE [Homo sapiens]


4e-13 


226 


AB014557


Borrelia burgdorferi
oligopeptide
permease homolog
OppAIV (oppAIV)
gene, complete cds 


1.3 


<NONE> 


<NONE> 


<NONE> 


227 


AF000948 
ACflS7287 


Mus musculus 
RAB/Rip protein 
mRNA, partial cds.


1.3 


2498005 


MYC PROTO-ONCOGENE 
PROTEIN (C-MYC) proto-
oncogene [Sus scrofa]


2.6 


229


t\r\/j t wo ' 
U38951 


Drosophila 
melanogaster 
vacuolar ATPase 
subunit E


1.1


<NONE>


<NONE>

(U90209) RNA polymerase II 
largest subunit [Bonnemaisonia 
hamifera]


<NONE> 


Homo sapiens 
myogenic 

determining factor 3 


1.1 


3172134 


2.3 


230


AF027148 
AP079310 


Mus musculus histone 
deacetylase 3
(Hdac3) gene, exons 
4 through 15 and 
complete cds 


1.0 


1657601 


(U66220) unknown 
[Nannocystis exedens]


0.25 


231 




P.radiata lac gene for
laccase 


0.95 


996020 


(X91638) BRM protein [Gallus
gallus]


0.31 


232 


X52134 
D89016 


Human mRNA for 
Neuroblastoma, 
complete cds 


0.93 


<NONE> 


<NONE> 


<NONE> 


233 


X76392 


C.familiaris VIP36
(vesicular integral- 
membrane protein of 
36 kDa) mRNA 


0.93 


4176446 


(AL022238) dJ1042K10.2.1
(novel protein with probable 
rabGAP domains and Src 
homology domain 3) 


7e-81 


234 
?11 


AF100694


Mus musculus 
Pontin52 mRNA,
complete cds 


0.90 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor (BlasiN vs. Ue 
SEQ
ID
ACCESSION
DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


236


AE000991


Archaeoglobus
fulgidus section 116
of 172 of the
complete genome


0.90


1176579


EGT2 PROTEIN PRECURSOR
(EARLY G1 TRANSCRIPT 2)
probable membrane protein
YNL327w - yeast
(Saccharomyces cerevisiae)
cerevisiae]

>gi|1302445|gnl|PUJ|ezJ;o // 
(Z71603) ORF YNL327w
[Saccharomyces cerevisiae]


6.9 


237 


Z35922 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR053c 


0.86 


<NONE> 


<NONE> 


<NONE> 


238 


U47331 


Rattus norvegicus 
metabotropic 
glutamate receptor 4b 
mRNA, complete cds.


0.82 


1550703 


(Z80225) hypothetical protein 
Rv2662 


4.1 


239 


X72810 


H.sapiens Ig germline 
kappa-chain gene 
variable region (L3)


0.69 


3023063 


(AF052587) F14 [Xylella 
fastidiosa]


0. / 


240 


Z11700 


Escherichia coli 
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAEHIJ E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins 


0.69 


2347188


(AC002338) laccase isolog 
[Arabidopsis thaliana]


3.9 


241 


U71597 


Phrynosoma 
douglassii NADH 
dehydrogenase 
subunit 4 (ND4) 
gene, mitochondrial 
gene encoding 
mitochondrial 
protein, partial cds 


0.65 


<NONE> 


<NONE> 


<NONE> 
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Nearest M»iohhr.r iBlastN vs. Genbank) 



"Nearest Neighbor (BlastX vs. Non-Redundant Pro teins^ 




242 | Z77798 



Ammonia species 
LSU rRNA gene 
(partial; isolate Tr S 
5: clone 16)



243 D25542



Human mRNA for
golgi antigen gcp372
complete cds



244 M80234



245 AB007918



246 X51754



247 AE001554



249 AJ223768 



Cow dopamine 
transporter mRNA, 
putative cds.



Homo sapiens mRNA
for KIAA0449
protein, partial cds
Human U266
rearranged DNA for
lambda
immunoglobulin light
chain



Helicobacter pylori
strain J99 section 115
of 132 of the
complete genome

H.sapiens CpG DNA,
clone 96e7, reverse
read cpg96e7.rt1a.



Pinus sylvestris 
microsatellite DNA,
clone SPAC11.5 



0.64 



0.64 



0.64 



0.64 



0.63 



0.62 



0.62 



1174506



SYNTHETASE glutamate
tRNA ligase (EC 6.1.1.17) -
Haemophilus influenzae (strain
Rd KW20) >gi|1573240
(U32713) glutamyl-tRNA
synthetase (gltX) [Haemophilus
influenzae Rd]



111230 



3874972 



ultra-high-sulfur keratin 1 - 
mouse 



2833239 



2072301 



(Z99709) similar to Elongation 
factor Tu family (contains 
ATP/GTP binding P-loop); 
cDNA EST EMBL:D76223 
comes from this gene; cDNA 
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]
EPIDERMAL GROWTH
FACTOR RECEPTOR 
KINASE SUBSTRATE EPS8 
>gi|530823 (U12535) epidermal 
growth factor receptor kinase 
substrate [Homo sapiens] 



(U95102) mitotic
phosphoprotein 90 [Xenopus
laevis]



0.62



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>



<NONE>
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Nearest Neighbor miMtX vs. Non-Redundant PtOttmsT 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AF125443) contains similarity




260 


U52428 


Human fatty acid 
synthase gene, partial 
cds 


0.61 


4226073 


to S. pombe phosphatidyl 
synthase (GB:Z28295) 
[Caenorhabditis elegans] 


2e-26 


261 


X15292


Plasmodium 
falciparum gene for 
heat-shock protein 
pPf203 


0.60 


<NONE> 


<NONE> 


<NONE> 


262 


AB020663


Homo sapiens mRNA 

for KIAA0856 
protein, partial cds 


0.60 


470341 


(U00043) No definition line 
found [Caenorhabditis elegans]


5.7


263 


U68723 


Human checkpoint 
suppressor 1 mRNA, 
complete cds 


0.60 


544375 


GALACTOSE-BINDING
PROTEIN REGULATOR
glucose/galactose binding
protein regulator -
Agrobacterium tumefaciens
>gi|142228 (L10424)
glucose/galactose binding 
protein regulator 


5.7 


264 


M32687 


S.griseus sporulation 
protein genes 1590 
and 1422. 


0.60 


2582017 


(AF012871) Mergla' [Mus 
musculus] 


3.3 


265 


AJ005331


Homo sapiens 
NKCC2 gene, exon 4, 
isoform B 


0.60 


3128353 


(AF010496) maltose transport
inner membrane protein 


1.5 


266 


U14103 


Mus musculus RGL 
protein mRNA, 
complete cds. 


0.60 


4099845 


(U90533) serine protease 
inhibitor [Streptomyces fradiae]


0.098 


267 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.59 


3282851 


(AF047897) ankyrin-like protein 
HGE-ANK [Ehrlichia sp. BDS]


5.5 


268 


AE000872


Methanobacterium
thermoautotrophicum 
from bases 896604 to 
912784 (section 78 of 
148) of the complete 
genome 


0.59 


401553 


HYPOTHETICAL 24.5 KD 
PROTEIN IN NADB-SRMB 
INTERGENIC REGION 


4.3 
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Nearest Neiehbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












hypothetical protein - human




269 


LU871 


Gallus gallus achaete- 
scute homologue 
(ASH) mRNA, 
complete cds. 


0.59 


628110 


herpesvirus 4 reading frame 1

[Human herpesvirus 4] 2 
[Human herpesvirus 4] 
>gi|1334838|gnl|PID|e25079 4 
[Human herpesvirus 4] 
>gi|1334840|gnl|PID|e25081 6 
[Human herpesvirus 4]
>gi|1334842|gnl|PID|e25067 8
[Human herpesvirus 4]
>gi|1334844|gnl|PID|e25069 10
[Human herpesvirus 4]
>gi|1334846|gnl|PID|e25071 12
[Human herpesvirus 4] 


4.2 


270 


AF017114


Oryctolagus 
cuniculus glycogen 
synthase mRNA, 
butii^'t*'**' ^v** 


0.59 


728856 


NITROGENASE IRON-IRON
PROTEIN ALPHA CHAIN
(NITROGENASE
COMPONENT I)
(DINITROGENASE) capsulatus
>gi|312238 (X70033)
alternative nitrogenase


2.4 


271 


AF027807 


Homo sapiens beta-
casein (CSN2) gene,
complete cds 


0.59 


3252932 


(AF067155) truncated rev. 
protein [Human 
immunodeficiency virus type 1] 


1.5 


272 


U81787 


Human Wnt10B
mRNA, complete cds


0.59 


3875538 


(Z67990) similar to cuticle 
collagen


1.4 


273 


U76036 


Apteryx australis Ibis 
ribosomal RNA gene, 
mitochondrial gene 
for mitochondrial 
RNA, partial 
sequence 


0.59 


4193356 


(AF055088) ATP-binding 
cassette; PsaB [Streptococcus 
pneumoniae] 


0.83 


274 


AB014564


Homo sapiens mRNA 
for KIAA0664 
protein, partial cds 


0.59 


1709851 


PTB-ASSOCIATED SPLICING
FACTOR (PSF) long form -
human >gi|38458 (X70944)
PTB-associated splicing factor
[Homo sapiens] 


0.17 


275 


AF044171 


Homo sapiens cyclin- 
dependent kinase 
inhibitor 2D 
(CDKN2D) gene, 
partial cds 


0.59 


3925213 


(AL032626) Y37D8A.17 
[Caenorhabditis elegans]


3e-10 


276 


L19640


Saccharomyces
cerevisiae cdc2/cdc28
related protein kinase
gene, complete cds.


0.59 


3880115


(Z81130) T23G11.9
[Caenorhabditis elegans]


1e-21


277 


Z80999 


Human DNA
sequence from
cosmid E140G5 on
chromosome 22, 
complete sequence 
[Homo sapiens] 


0.58 


<NONE> 


<NONE> 


<NONE> 


278 


Y11108 


H.sapiens WNT8B 
gene 


0.58 


<NONE> 




<NONE> 


279 


U80001


Sphyraena idiastes 
lactate dehydrogenase 
A 


0.58 


<NONE> 


<NONE> 


<NONE> 


280 


Z49637 


S.cerevisiae 
chromosome X 
reading frame ORF 
YJR137C 


0.58 


<NONE> 


<NONE> 


<NONE> 


281 


X64467 


H.sapiens ALAD
gene for 

porphobilinogen 
synthase 


0.58 


<NONE> 




<NONE> 


282 


X74506 


G.gallus hox B3 
mRNA 


0.58 


<NONE> 


<NONE> 


<NONE> 


283 


U68040 


Cochliobolus 
heterostrophus 
polyketide synthase


0.58 


<NONE> 


<NONE> 


<NONE> 


284 


AF089084 


Arabidopsis thaliana 
putative auxin efflux 
carrier protein (PIN1)
mRNA, complete cds


0.58 


<NONE> 


<NONE> 


<NONE> 


285 


U38481 


Rattus norvegicus 

nAi/ a l n hl mRNA 

complete cds 


0.58 


<NONE> 


<NONE> 


<NONE> 


286 


AF017656


Homo sapiens G
protein beta 5 subunit
mRNA, complete cds


0.58 


3236249 


(AC004684) hypothetical 
protein [Arabidopsis thaliana]


9.2 


287 


M96234 


Human glutathione 
transferase class mu 
number 4 . 


0.58 


1280073 


(U55366) Similar to cuticle 
collagen [Caenorhabditis 
elegans] 


7.1 


288 


AB002339 


Human mRNA for 
KIAA0341 gene, 
partial cds 


0.58 


861293 


(U28741) F35D2.1 gene 
product [Caenorhabditis 
elegans] 


7.1 





289 


U11295


Neisseria 
gonorrhoeae 
carbamoyl phosphate 
synthetase 
(glutamine) small 
subunit (carA) and 
large subunit (carB) 
genes, complete cds. 


0.58 


2425135 


(AF020283) DG2044 gene 
product [Dictyostelium 
discoideum] 


5.3 


290 


D80001 


Human mRNA for 
KIAA0179 gene,
partial cds 


0.58 


4097223 


(U49836) gamma-glutamyl 
transpeptidase precursor [Brugia 
malayi] 


4.1 


291 


Z11700


Escherichia coli
genes faeG, faeH,
faeI, faeJ and IS629-
like insertion
sequence. > ::
emb|Z11710|ECFAE
HIJ E.coli faeH, faeI
and faeJ genes
encoding FaeH, FaeI
and FaeJ proteins


0.58 


2347188 


(AC002338) laccase isolog 
[Arabidopsis thaliana]


3.2 


292 


IVl J fjjV 


Mouse hair keratin 
A1 (MHKA1) gene,

Lompicic 


0.58 


141163 


HYPOTHETICAL 8.3 KD 
PROTEIN >gi|62179


3.2 


293 


X63787 


T.thermophila gene 
for snRNA U3-2 


0.58 


2826900 


(AB004461) DNA polymerase 
alpha catalytic subunit [Oryza 
sativa] 


3.1 


294 


D63881 


Human mRNA for 
KIAA0160 gene, 
partial cds 


0.58 


1934730 


(U95036) germin-like protein 
[Arabidopsis thaliana]


3.1 


295 


U39378 


Gymnocarena
mexicana 16S 
ribosomal RNA gene, 
mitochondrial gene 
encoding 

mitochondrial RNA, 
partial sequence 


0.58 


2194131 


(AC002062) Similar to 
Synechocystis antiviral protein


3.1 


296 


X87987 


P.pastoris PRC 1 gene 

> :: 

dbj|E12103|E12103
DNA encoding 
precursor of protease 
from Pichia pastoris 


0.58 


3914197 


OCCLUDIN >gi|1276983
(U49221) occludin [Canis
familiaris]

>gi|1589181|prf||2210347D
occludin [Canis familiaris]


3.1 




Rhodococcus opacus
chloromuconate
cycloisomerase
transposase homolog


0.58


3551821


(AF058803) mucin 4 [Homo
sapiens]


0.041 


306 
307 


AF003948 j 
1 

f 

X99350 J 

— — — — — 

AJ234282 


H.sapiens HFH4
gene, exon 1 and


0.58


137483 


VAV PROTO-ONCOGENE
>gi|55221 (X64361) proto-
oncogene [Mus musculus]


0.024 


joined CDS

Homo sapiens mRNA

for Ig heavy chain

variable region, clone 

C. 


0.58 


3264846 


(AC003682) R27945.2 [Homo
sapiens] 


0.018 


308 


AF079310


Mus musculus histone
deacetylase 3 
(Hdac3) gene, exons 
4 through 15 and 
complete cds 


0.58 


1657601 


(U66220) unknown 
[Nannocystis exedens]


0.014 


309 




Human thiopurine 
methyltransferase 
(TPMT) gene, exons 
6 and / 


0.58 


3283352 


(AF063020) lens epithelium-
derived growth factor [Homo
sapiens]


0.011 


310 


AF019367


M.musculus gene for
protein kinase C-
gamma (exon 1 and
exon 2) 


0.58 


1790878


(U38291) microtubule-
associated protein 1a [Homo
sapiens] 


0.008 


311 


X65720 
AB011155 


Homo sapiens mRNA

for KIAA0583 
protein, partial cds 


0.58 


1351166


SYNAPSINS IA AND IB
>gi|163713


0.006 


312 




H.sapiens mRNA for 


0.58 


1817548 


(D84307) phosphoethanolamine 
cytidylyltransferase [Homo 
sapiens]


0.001 


313 


X63692 


DNA 

" Feline 
immunodeficiency 
virus isolate FIV- 
Pco336-8 pol
polyprotein (pol) 


0.58 


2246532


(U93872) ORF 73, contains
large complex repeat CR 73


2e-05 


314 

315 


U53746 
K00436 


gene, partial cds

Rattus norvegicus
(clone rtl-U pseudo-
Gly-tRNA gene.


0.58 


206712


(M64793) salivary proline-rich
protein [Rattus norvegicus]


1e-05


316 


S79632 


HSF2=heat shock 
factor 2 {alternatively 
spliced, splice 
junction region } 
[mice, CBA/J, testis,
Genomic, 120 nt,
segment 2 of 3] 


0.58 


4038594 


(AJ222798) tDET1 protein
[Lycopersicon esculentum] 


3e-06 


317 


D43964 


Rat liver mRNA for 
Kan-1, complete cds


0.58 


1280135 


(U55376) coded for by C.
elegans cDNA cm21e6; coded 
for by C. elegans cDNA 
cm01e2; similar to melibiose 
carrier protein 

(thiomethylgalactoside permease 

m 


1e-08


318 


AB007918


Homo sapiens mRNA 
for KIAA0449 
protein, partial cds 


0.58 


2833239 


EPIDERMAL GROWTH 
FACTOR RECEPTOR 
KINASE SUBSTRATE EPS 8 
>gi|530823 (U12535) epidermal
growth factor receptor kinase 
substrate [Homo sapiens] 


3e-13 


319 


AB001466


Homo sapiens mRNA 
for Efsl, complete 
cds 


0.58 


2943716 


(D45027) 25 kDa trypsin 
inhibitor [Homo sapiens]


2e-14 


320 


Z11701 


Saccharomyces 
cerevisiae ERE1 gene 
for putative protein 
kinase. 


0.58 


3880115 


(Z81130) T23G11.9 
[Caenorhabditis elegans]


9e-21 


321 


Z49535 


S.cerevisiae 
chromosome X 
reading frame ORF 
YJR035w 


0.58 


4106562 


(Z83819) dJ146H21.2 (similar 
to CYTOCHROME B-245 
HEAVY CHAIN) [Homo 
sapiens] 


3e-33 


322 


M62506 


S.cerevisiae DBF20 
gene, complete cds. 


0.57 


<NONE>


<NONE> 


<NONE> 


323 


X05944 


Yeast PSS gene for 
phosphatidylserine
synthetase 


0.57 


<NONE> 


<NONE> 


<NONE> 


324 


D38536 


Snail gene for ADP- 
ribosyl cyclase, 
complete cds 


0.57 


<NONE> 


<NONE> 


<NONE> 


325 


Z75004 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR096w 


0.57 


<NONE> 


<NONE> 


<NONE> 


326 


L77034 


Homo sapiens 
(subclone 10_e10
from P1 H16) DNA
sequence. 


0.57 


<NONE> 


<NONE> 


<NONE> 





327


D37887


Cyprinus carpio c-
myc gene for c-Myc,
complete cds


0.57 


<NONE>


<NONE> 


<NONE> 


328 


AB014562


Homo sapiens mRNA
for KIAA0662
protein, partial cds


0.57 


197406 


(M57576) Ig kappa chain [Mus 
musculus] 


8.9 


329 


Z69651 


Human DNA
sequence from
cosmid L75B9,
Huntington's Disease
Region, chromosome
4p16.3


0.57 


1079280 


chaperonin containing TCP-1
complex gamma chain - African 
clawed frog >gi|793886 
(X84990) Ccti! 


8.9 


330 


D89285 


Mesocricetus auratus
mRNA for inter-alpha 
trypsin inhibitor 
heavy chain 1, 
complete cds 


0.57 


134132 


RYANODINE RECEPTOR. 
SKELETAL MUSCLE 


6.9 


331 


Z48951 


S.cerevisiae
chromosome XVI
cosmid 9723


0.57 


4210432 


(AJ130783) APC2 protein [Mus
musculus] 


5.3 


332 


X95573 


A.thaliana mRNA for
salt-tolerance zinc
finger protein


0.57 


1174828 


TYROSINE 
DECARBOXYLASE 2 
4.1.1.25) - parsley >gi|169671
(M96070) tyrosine
decarboxylase [Petroselinum




333 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.57 


465646


PROBABLE ABC
TRANSPORTER ATP-
BINDING PROTEIN IN
NTRA/RPON 5'REGION
(ORF1) Azorhizobium
caulinodans >gi|311388
(X69959) ORF1


4.0 


334 


AE001116 


Borrelia burgdorferi 
(section 2 of 70) of 
the complete genome


0.57


2314735 


(AE000653) Na+/H+ antiporter 
(nhaA) [Helicobacter pylori 
26695]


4.0 


335 


Z34291 


R.norvegicus mRNA 
for putative chloride 
channel. 


0.57 


1350832 


DNA-DIRECTED RNA
POLYMERASE I SECOND
LARGEST SUBUNIT (RNA
POLYMERASE I SUBUNIT 2)
chain RPA2 - Euplotes
octocarinatus (SGC9)
>gi|578407 octocarinatus]


3.0 


336


D88255


Homo sapiens A30
Vk germline gene,
partial cds


0.57


3875983


(Z81063) similar to Actinin-type
actin-binding domain containing
proteins [Caenorhabditis
elegans]


3.0 


337 


AF037261 


Homo sapiens SH3- 
containing adaptor 
molecule-1 mRNA,
complete cds 


0.57 


1397341 


(U6lV3J> Similar to kinesin-like
protein; coded for by C. elegans
cDNA yk184h5.3; coded for by
C. elegans cDNA yk184h5.5;
coded for by C. elegans cDNA
yk13d7.3; coded for by C.
elegans cDNA yk13d7.5; coded
for by C. elegans cDNA
yk31e1.5; co... >gi|3493541
(AF057567) kinesin-like protein
ZEN-4a [Caenorhabditis
elegans]


2.3 


338 


U26595 


Rattus norvegicus 
prostaglandin F2a 
receptor regulatory 
protein precursor, 
mRNA, complete cds


0.57 


2773160 


(AF039656) neuronal tissue- 
enriched acidic protein [Homo 
sapiens] 


2.3 


339 


X69903 


R.norvegicus mRNA 
for interleukin 4 
receptor 


0.57 


2649193 


(AE001009) quinone-reactive 
Ni/Fe-hydrogenase B-type 
cytochrome subunit (hydC) 
[Archaeoglobus fulgidus]


1.8 


340 


Z74825 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOL083w 


0.57 


1458319 


(U64846) F47D2.5 gene 
product [Caenorhabditis 
elegans]


1.4 


341 


AJ131469 


Foot-and-mouth 
disease virus O vp1
gene, strain O/A/58 


0.57 


91206 


proline-rich protein - mouse 
(fragment) musculus] 


1.4 


342 


AF011360


Mus musculus
regulator of G-protein
signaling 7 (RGS7)
mRNA, complete cds


0.57 


542514 


gelsolin - American lobster


0.80 


343 


AF011360


Mus musculus
regulator of G-protein
signaling 7 (RGS7)
mRNA, complete cds


0.57 


1078946 


gelsolin - American lobster 
>gi|452313 gelsolin [Homarus 
americanus] 


0.80 





344


L39210


Homo sapiens inosine
monophosphate
dehydrogenase type II
gene, complete cds


0.57 


559526 


(X77466) 98.8kD polyprotein
[Strawberry latent ringspot
virus] 


0.79 


345 


U81523


Human endometrial
bleeding associated 
factor mRNA, 
complete cds 


0.57 


211499 


(K01702) HMW/LMW collagen
subunit precursor [Gallus gallus] 


0.79 


346 


U46561 


Tetrahymena 
thermophila 
polyubiquitin (TTU3) 
gene, complete cds. 
and RNA polymerase 
II subunit 2 (RPB2) 
gene, partial cds 


0.57 


2506493 


HYPOTHETICAL 100.5 KD 

doatttw r\l T AP.PYSH 

INTERGENIC REGION 
>gi|8S2654 (U29579) alternate
gene name ygcB; (JKr_ioos
[Escherichia coli] >gi|1789119


0.60 


347 


X95543 


C.japonica mRNA for 
legumin (clone 
CjLes?3n 


0.57 


1709261 


NEUROFILAMENT TRIPLET
M PROTEIN (160 KD
NEUROFILAMENT
PROTEIN) (NF-M)
>gi|1083164|pir||S55395
neurofilament protein M - rabbit
(fragment) >gi|854353


0.46 


348 


Y17282


Homo sapiens mRNA
for cytokeratin type II


0.57 


3044086 


(AF055904) unknown 
[Myxococcus xanthus]


0.45 


349 


X00716


Frog mRNA fragment
for alpha-A2-
crystallin 


0.57 


3406654 


(AF079369) transcriptional 
repressor TUP1 [Dictyostelium 
discoideum] 


0.20 


350 


X53238


Klebsiella sp.
bacteriophage K11
gene 1 for RNA 
polymerase 


0.57 


1228093 


(Z46913) polyketide synthase


0.16 


351 


X99012 


H.sapiens FUS gene, 
exon 12 


0.57 


243898


(S78897) GOR=antigenic
epitope [chimpanzees, Peptide,
427 aa] [Pan]


0.090 


352 


AL008711


Human DNA 
sequence from PAC 
390N22 on 
chromosome Xp22.2 


0.57 


1469545 


(U53585) fibronectin attachment
protein [Mycobacterium avium]


0.053


353 


S74506


SOX9 [human, fetal
brain, Genomic, 149*
nt, segment 3 of 3]


0.57


1326350 


(U5S748) similar to potential 
transmembrane domains in S. 
cerevisiae nuclear division
RFT1 protein (SP:P38206)


0.017 





354 


D25542 


Human mRNA for 
golgi antigen gcp372, 
complete cds 


0.57 


4063399 


(AF102575) cell surface protein
DTFA [Dictyostelium 
discoideum] 


0.005 


355 


AB015426 


Mus musculus mRNA 
for alpha1,3-
fucosyltransferase IX,
complete cds 


0.57 


2661842 


(Y15732) DNA polymerase beta 
[Xenopus laevis]


7e-11


356 


X51394 


Xenopus mRNA for
APEG protein, 
containing a highly 
repetitive amino acid 
sequence 


0.57 


1929056 


(Y12090) putative 3,4-
dihydroxy-2-butanone kinase
[Lycopersicon esculentum]


9e-12 


357 


AB007918 


Homo sapiens mRNA 
for KIAA0449 
protein, partial cds 


0.57


2833239


EPIDERMAL GROWTH
FACTOR RECEPTOR
KINASE SUBSTRATE EPS 8
>gi|530823 (U12535) epidermal
growth factor receptor kinase 
substrate [Homo sapiens] 


3e-13 


358 


AB001466 


Homo sapiens mRNA 
for Efsl, complete 
cds 


0.57 


2943716 


(D45027) 25 kDa trypsin 
inhibitor [Homo sapiens] 


2e-14 


359 


Y00760 


Rabbit mRNA for 
adult fast skeletal 
troponin-C 


0.57 


2576348 


(AC002400) Glutamyl tRNA 
synthetase [Homo sapiens] 


2e-28 


360 


X95153 


H.sapiens brca2 gene 
exon 3 > :: 

emb|Ao2/ /8|Ao_f '8 
Sequence 19 from 
Patent WO9719110 


0.57 


3419847 


(AC004982) similar to yeast
hypothetical protein ybk4; 
similar to P38164 
(PID:e586461) [Homo sapiens] 


2e-55 


361 


X85967 


B. vulgaris mRNA for 
betavulgin 


0.56 


<NONE> 


<NONE> 


<NONE> 


362 


U09251 


Mycoplasma 
genitalium DNA
gyrase subunit B 
complete cds, DNA 
polymerase III beta 
subunit (dnaN) and 
seryl-tRNA 
synthetase (serS) 
genes, partial cds.


0.56 


<NONE> 


<NONE> 


<NONE> 


363 


V00158 


Chloroplast Euglena 
gracilis genes coding 
for transfer RNAs 
specific for threonine 
glycine, methionine, 
serine and glutamine.


0.56 


<NONE> 


<NONE> 


<NONE> 


364


D88151


Clostridium
perfringens DNA for
D-alanine:D-alanine
ligase, cortical
fragment-lytic
enzyme


0.56 


<NONE> 


<NONE> 


<NONE> 


365 


U67478 


Methanococcus 
jannaschii section 20 
of 150 of the 
complete genome


0.56 


<NONE> 


<NONE> 


<NONE> 


366 


L23800 


Tachyglossus 
aculeatus beta-globin 
homolog (HBB) 
gene, complete cds 


0.56 


<NONE> 


<NONE> 




367 


AB011129


Homo sapiens mRNA 
for KIAA0557 
protein, partial cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


368 


L77034 


Homo sapiens 
(subclone 10_e10
from P1 H16) DNA
sequence.


0.56


<NONE> 


<NONE> 


<NONE>


369 


Z47202 


C.albicans gene for 
TFIIIB (BRF1) 
subunit. 


0.56 


<NONE> 


<NONE> 


<NONE> 


370 


U53868 


Clostridium 
acetobutylicum 
mannitol-specific 
phosphotransferase 
system (PTS) system, 
mtlA, mtlR, mtlF, and
mtlD genes, complete 
cds 


0.56 


<NONE>


<NONE> 


<NONE> 


371 


AF041259 


Homo sapiens breast 
cancer putative 
transcription factor 
(ZABC1) mRNA, 
complete cds 


0.56 


<NONE> 


<NONE> 


<NONE> 


372 


L42636 


Plasmodium 
falciparum variant- 
specific surface 
protein (var-7) 
mRNA, complete cds.


0.56 


2213557 


(Z97052) hypothetical protein 


8.8 


373 


U96180 


Human protein 
tyrosine phosphatase 
(TEP1) mRNA, 
complete cds 


0.56 


731016 


THIOREDOXIN REDUCTASE 
thioredoxin reductase (NADPH) 
[Coxiella burnetii]


8.7 


374 


L76259 


Homo sapiens PTS 
gene, complete cds 


0.56 


2369863 


(Y12225) Spi-1/PU.1
transcription factor 


6.7 


375 


AF045946 


Mus musculus 
D16Jhul7 YAC 
98B3 acentric end, 
partial sequence 


0.56 


2130017 


hypothetical protein - common 
sunflower protein [Helianthus 
annuus] 


5.1 


376 


X97986 


M. musculus mRNA 
for desmocollin type 
1 


0.56


4038031 


(AC005936) hypothetical 
protein [Arabidopsis thaliana] 


3.9 


377 


X79437 


M.musculus whey 
acidic protein (WAP) 
gene, exon 1 


0.56 


549670 


SPINDLE POLE BODY
COMPONENT SPC42 yeast
(Saccharomyces cerevisiae)
>gi|486054 (Z28042) ORF
YKL042w [Saccharomyces
cerevisiae] >gi|666098
(X71621) hypothetical 42.3 kD
protein [Saccharomyces
cerevisiae]


3.9 


378 


M27902 


Rat cardiac specific 
sodium channel alpha- 
subunit mRNA, 
complete cds. 


0.56 


585234 


ENDOGLUCANASE G 
PRECURSOR 3.2.1.-) CelCCG
precursor - Clostridium 
cellulolyticum cellulolyticum] 


3.9 


379 


AF036696 


Caenorhabditis 
elegans cosmid 
F15B10 


0.56 


546071 


gp70=envelope protein 

{ endogenous provirus } host=cat 

lymphoid tissues. Peptide. 445 

aa] 


3.6 


380 


Z99102


Caenorhabditis 
elegans cosmid 
B0331, complete 
sequence 
[Caenorhabditis 
elegans]


0.56 


603664 


transcriptase; ORF2; encodes aa 
motifs conserved in reverse 
transcriptases; most closely 
related reverse transcriptases are 
those of non-LTR 
retrotransposons. The 3' 901 bp 
of this CDS are identical to the 
3' 901 bp...


3.0 


381 


L27850 


Equus caballus (clone 
T131) T-cell receptor
DNA, V-region.


0.56 


1079150 


transcription factor shn - fruit fly 


1.7 


382


X97986


M.musculus mRNA
for desmocollin type
1


0.56


2497227


HYPOTHETICAL 113.1 KD
PROTEIN IN PRE5-FET4
INTERGENIC REGION
>gi|1072409 (Z54141) unknown


1.7 


383 


AF087455 


Didelphis virginiana 
G protein receptor 
kinase 2 mRNA. 
complete cds 


0.56 


1213453 


(U12964) contains ankyrin-like 
repeats; similar to human 
desmoplakin repeat region 
[Caenorhabditis elegans]


1.3 


384 


D80011


Human mRNA for 
KIAA0189 gene, 
complete cds 


0.56 


226535 


protease [Hepatitis B virus] 


1.1


385 


AJ002272 


Mus musculus mRNA 
for HAP1-A protein,
3' region 


0.56 


3327158 


(AB014572) KIAA0672 protein 
[Homo sapiens] 


1.0 


386 


L39210 


Homo sapiens inosine 
monophosphate 
dehydrogenase type II 
gene, complete cds 


0.56 


628431 


coat protein - strawberry latent 
ringspot virus


0.77 


387 


X02770 


Mouse Thy-1.2 gene
5' untranslated region 
and exon 1 


0.56 


3327046 


(AB014516) KIAA0616 protein
[Homo sapiens] 


0.59 


388 


AF038575 


Schizosaccharomyces 
pombe Wiskott- 
Aldrich Syndrome 
protein homolog 
(wspl+) gene, 
complete cds, and 
BTF3/beta-NAC 
gene, partial sequence


0.56 


88466 


salivary proline-rich 
phosphoprotein precursor PRH1 
(allele PIF) - human >gi|190484
(K03203) prepro salivary
proline-rich protein [Homo
sapiens] >gi|190512


0.35 


389 


X56747 


Rat mRNA for fetal 
intestinal lactase- 
phlorizin hydrolase 
precursor, partial 


0.56 


2072742 


(Z48674) chitinase homologue 
[Sesbania rostrata]


0.23 


390 


Y12072


G.arboreum mRNA
for farnesyl
pyrophosphate 
synthase 


0.56 


296670 


(X07882) Po protein [Homo 
sapiens] 


0.20 


391 


S75756 


p15=cyclin D-
dependent kinases 4
and 6-binding
protein/p15 product
{exon/intron 1}
[human, brain tumors,
Genomic, 753 nt]


0.56 


1082743 


protein kinase (EC 2.7.1.37) 
SPRK - human sapiens] 
>gi|1090771|prf||2019437A
protein Tyr kinase I


0.15


392


U62528


Equus caballus type
II collagen mRNA,
complete cds


0.56 


461671 


[Segment 1 of 2] COLLAGEN
.ALPHA HI) CHAIN 


0.030 


393 


X96877 


C.reinhardtii mRNA 
for unknown lumenal 
polypeptide 


0.56 


3341678 


(AC003672) putative zinc finger
protein [Arabidopsis thaliana] 


5e-09 


394 


S78788 


ctiAlAo [chickens, 
liver. Genomic, 979 
nt, segment 4 of 4] 


0.56 


2661590 


(AL009196) 1- 

evidence=predicted by content; 
1-method=genefinder;084; 1-
method_score=59.41; 1- 
evidence_end; 2- 
evidence=predicted by match; 2- 
match_accession=AA950019; 2- 
match_description=LD29959.5p 
rime LD Drosophila 
melanogas...


2e-11


395 


AF006640 


Drosophila 
melanogaster Ste20- 
like protein kinase 
mRNA. complete cds 


0.56 


1109830


(U41534) coded for by C.
elegans cDNA CEESI42F;
Similar to helicases of
SNF2/RAD54 family.
[Caenorhabditis elegans]


6e-12


396 


AF006640 


Drosophila 
melanogaster Ste20-
like protein kinase
mRNA, complete cds 


0.56 


1109830 


(U41534) coded for by C. 
elegans cDNA CEESI42F; 
Similar to helicases of 
SNF2/RAD54 family. 
[Caenorhabditis elegans]


4e-13 


397 


AE000716 


Aquifex aeolicus 
section 48 of 109 of 
the complete genome 


0.56 


3688350 


(AL030996) dJ1189B24.4
(novel PUTATIVE protein 
similar to hypothetical proteins 
S. pombe C22F3.14C and C. 
elegans l lunj.oj [Homo
sapiens] 


3e-66 


398 


Z36079 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR210w 


0.55 


<NONE> 


<NONE> 


<NONE> 


399 


Y 17267 


Mus musculus mRNA
for ubiquitin
conjugating enzyme


0.55 


<NONE>


<NONE> 


<NONE> 


400 


AC001461


Homo sapiens
(subclone 2_g5 from
BAC H107) DNA
sequence 


0.55 


<NONE> 


<NONE> 


<NONE> 


401


AF019079


Alouatta seniculus
susceptibility
(BRCA1) gene,
partial cds


0.55 


<NONE> 


<NONE> 


<NONE> 


402 


M90058 


Human serglycin 
gene, exons 1, 2, and
3. 


0.55 


<NONE> 


<NONE> 


<NONE> 


403 


AB013469


Mus musculus CLM2 
gene for cytohesin 2, 
complete and partial 
cds, alternative 
splicing 


0.55 


1729760 


(Z68152) chitinase [Gossypium 
hirsutum] 


8.6 


404 


AJ011592 


Bacteriophage P1 ban
gene 


0.55 


2493689 


PHOTOSYSTEM II 10 KD
PHOSPHOPROTEIN deltoides]
>gi|2143326|gnl|PID|e319090
(Y13328) 10kDa
phosphoprotein [Populus 
deltoides] 


6.6 


405 


Z15118 


T.brucei kinetoplast 
maxicircle variable 
region DNA 


0.55 


2970432 


(AF049132) NADH
dehydrogenase subunit 5
[Florometra serratissima]


6.5 


406 


Z48951


S.cerevisiae 
chromosome XVI 
cosmid 9723 


0.55 


4210432 


(AJ130783) APC2 protein [Mus
musculus]


4.9 


407 


U78726 


Homo sapiens mad 
protein homolog 
Smad2 gene, 
promoter, exon 1a
and exon 1b


0.55 


3319290 


(AF055994) thyroid hormone
receptor-associated protein
complex component TRAP220
[Homo sapiens]


4.9 


408 


AG001389


Homo sapiens
genomic DNA, 21q
region, clone:
5HllBm42 


0.55 


125684 


KRUEPPEL PROTEIN 
>gi|72899|pir||TWFF Krueppel
gap protein - fruit fly
(Drosophila sp.) melanogaster]
>gi|224875|prf||1202348A
Krueppel gene


3.8 


409 


M27640


Plasmodium vivax
major blood stage
surface antigen gene,
partial cds.


0.55


549453


X-LINKED PEST-
CONTAINING
TRANSPORTER transporter -
human >gi|458255 (U05321) X-
linked PEST-containing
transporter [Homo sapiens]


3.8 


410 | D37977



Fugu rubripes mRNA
for sodium channel
alpha subunit, partial
cds



0.55 



411 | M88505



Ostertagia ostertagi
cathepsin B-like
cysteine protease
gene, partial cds.



1435038 



0.55 



3941277 



(D38024) ORF [Homo sapiens]



(AF000900) p45 [Rattus
norvegicus]



3.7 



412 | U95098



Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds



0.55 



413 | U89241 



Human mibp gene,
partial cds



2570154 



(AB008376) 17-kDa PKC-
potentiated inhibitory protein of
PP1 [Sus scrofa]



414 | AF027151



415 | AF006821



416 | Y12736



Xenopus laevis
survival of motor
neuron protein
interacting protein 1
(SIP1) mRNA,
complete cds



0.55 | 4097465 | (U62253) 16kDa secretory protein [Sus scrofa]



0.55 



Bufo marinus 
natriuretic peptide 
receptor C mRNA, 
partial cds



4007790 



(AL034463) putative single- 
strand polynucleotide binding 
protein [Schizosaccharomyces 
pombe]



Lactococcus lactis 
cremoris plasmid 
pJW565 DNA. 
HabiiM. llabiiR genes | 
and orfX 



0.55 | 2245075 | (Z97343) GTP-binding RAB2A protein



0.55 



3386334 



(AF035120) type I procollagen
pro-alpha 2 chain [Canis
familiaris]



2.8 
2.2 



1.7 



1.7 



417 | U38307



Mus musculus
collagen alpha-1 type
1 gene, 5' flanking
region, partial 
sequence. 



0.55 | 1362802 | gastric mucin - human (fragment) >gi|547517



1.3 
1.3 



418 | D13473



Mouse mRNA for 
Rad51 protein 



419 | AF045238



Bungarus fasciatus
acetylcholinesterase
gene, alternatively
spliced products,
partial cds



0.55 



1374698 



(D83032) nuclear protein,
NP220 [Homo sapiens] 



0.55 | 3261734 | (Z94752) hypothetical protein Rv1004c



420 | AE000795



Methanobacterium
thermoautotrophicum
from bases 1 to
1020S (section 1 of
148) of the complete 
genome 



0.55 | 186396 | (M94131) mucin [Homo sapiens]



0.97 
421


X99537


Y.lipolytica SEC62
gene


0.55


3876397


(Z81068) F25H5.2
[Caenorhabditis elegans]


0.58


422


1708147


Aquilegia sp.
phytochrome
(PHYB/D) gene,
partial cds.


0.55


2338024


(AF005370) ribonucleotide-
reductase, large subunit


0.57


423 


Z56586


H.sapiens CpG DNA
clone 12c8, reverse
read cpg12c8.rt1d.


0.55 


3320122


(U46007) espin [Rattus
norvegicus]


0.44


424 




Mus musculus 

phosphate 
amidotransferase 
(GFAT) gene. 5' 
region and partial cds 


0.55


282600


hypothetical protein -
Mycoplasma hyorhinis


0.43


425 


K02298


Rat chymotrypsin B 
(chyB) gene, 
complete cds. 


0.55 


3413810 


(Y17034) Bassoon [Mus 
musculus] 


0.33


426 


X84792 


M. musculus clusterin 
gene 


0.55 


1652475 


(D90905) hypothetical protein 


0.25


427 


U00185 


C nnrn .if^Q.TQmc 

Saanen and Weisse 
Edel breeds DR beta- 
chain antigen binding 

domain, MHC class II

DRB


0.55


2507136


SUBTILIN BIOSYNTHESIS
PROTEIN SPAB


0.19


428


Z54946


H.sapiens CpG DNA,
clone 178a12, reverse
read cpg178a12.rt1a.


0.55


807646


(M17294) unknown protein
[Human herpesvirus 4]


0.065


429


AF031650


Oryctolagus
cuniculus anion
exchanger 3 brain
isoform (AE3)
mRNA, complete cds


0.55


1778210


(U68412) fibrillar collagen
[Arenicola marina]


0.044


430


M25579


Bovine adenylyl
cyclase Type I
mRNA, complete cds.


0.55


2649040


(AE000997) conserved
hypothetical protein
[Archaeoglobus fulgidus]


0.023


431


Z48796


H.sapiens Ski-W
mRNA for helicase


0.55


330452


A 14708) DNA polymerase
[Human cytomegalovirus]


0.023


432 



M80234 



Cow dopamine 
transporter mRNA, 
putative cds. 



433 | U91616



Human I kappa B
epsilon (IkBe)
mRNA, complete cds 



0.55 



3874972 



similar to Elongation
factor Tu family (contains
ATP/GTP binding P-loop);
cDNA EST EMBL:D76223
comes from this gene; cDNA
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]



0.55 



3875577 



(Z68314) similar to G-protein;
cDNA EST EMBL:C11959
comes from this gene; cDNA
EST EMBL:C10341 comes
from this gene; cDNA EST
yk494e4.3 comes from this
gene; cDNA EST yk448a8.5
comes from this gene comes
from this gene; cDNA EST
EMBL:C10341 comes from this
gene; cDNA EST yk494e4.3
comes from this gene; cDNA
EST yk448a8.5 comes from this
gene [Caenorhabditis elegans]
>gi|3880364|gnl|PID|e1349948
(Z83016) similar to G-protein;
cDNA EST EMBL:C11959
comes from this gene; cDNA
EST EMBL:C10341 comes
from this gene; cDNA EST
yk494e4.3 comes from this
gene; cDNA EST yk448a8.5
comes from this gene
[Caenorhabditis elegans]



4e-04 



7e-06 



434



D10910



Arabidopsis thaliana
Atpk7 gene for 
serine/threonine 
protein kinase, 
complete cds 



435



436 



L22013 



Swinepox virus
complete ORFs
C20L-C1L > :: 
gb|I58297|I58297 
Sequence 14 from 
patent US 5651972 



0.55 



3876072 



(Z81505) Similarity to 
Methanococcus hypothetical
protein 0682 (TR:Q58095)
[Caenorhabditis elegans]



0.54 



4e-42 



<NONE> 



<NONE> 



<NONE> 



Z92653 



Human 

immunodeficiency 
virus type 1 env gene 



0.54 



<NONE> 



<NONE> 



<NONE>
Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



437 



K01992



438 



AE001415 



ACCESSION 



E.coli phosphate-
repressible
periplasmic
phosphate-binding
protein (phoS),
peripheral membrane
proteins (pstC, pstB
and phoU) and 
integral membrane 
protein (pstA) genes 
complete cds. 



Plasmodium 
falciparum 
chromosome 2, 
section 52 of 73 of 
the complete 
sequence 



0.54 



0.54 



DESCRIPTION 



P VALUE



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



439 



AF064030 



Helianthus tuberosus 
lectin 2 mRNA, 
complete cds 



0.54 



<NONE> 



440 



X12591 



E.coli plasmid DNA 
for colicin E9 



<NONE> 



<NONE> 



0.54 



<NONE> 



<NONE> 



<NONE> 



441 



U73679 



Caenorhabditis 
elegans YNKI-a 
mRNA. complete cds 



0.54 



<NONE> 



<NONE> 



<NONE> 



442 



Z93990 



Unidentified 
bacterium DNA for 
16S ribosomal RNA



0.54 



<NONE> 



443 



X85967 



B. vulgaris mRNA for 
betavulgin 



<NONE> 



<NONE> 



0.54 



(Z37980) ORF12 [Escherichia 
J57836 coli]



8.3 



444 



U76524 



Sambucus nigra 

ribosome inactivating
protein precursor 
mRNA. complete cds 



0.54 



(M80653) tetraheme 
151377 [Pseudomonas stutzeri] 



6.2 



445 



X71800



H.sapiens gene for 5S 
rRNA (640 bp) > 
emb|X71801|HS5SR6 
40B H.sapiens gene 
5S rRNA (640 bp)



0.54 



446 



U89241 



Human mibp gene, 
partial cds 



(AE001216) T. pallidum 
3322653 predicted coding region TP0369 



2.7 



0.54 



(b'62253) 16kDa secretory 
4097465 protein [Sus scrofa]



2.2 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















447 


L16013 


Rattus norvegicus Q- 
like gene sequence 


0.54 


3087760 


(AJ005583) p75 protein 
[Crypthecodinium cohnii]


0.95 


448 


U60275 


Capra hircus skeletal 
muscle voltage-gated 
chloride channel 
gClC-1 mRNA, 
partial cds 


0.54 


1781344 


t t lUHja) riouo poiyKetiae 
svnthase 


0.95 


449 


U36795 


Myxococcus xanthus 
rfbABC O-antigen 
biosynthesis operon. 
rfbA, rfbB, and rfbC 
genes, complete cds. 


0.54 


3877232 


(Z81540) predicted using
Genefinder 


0.74 


450 


AF053091


Drosophila 
melanogaster eyelid 

complete cds 


0.54 


2144110 


zinc finger protein RTZ - rat 
>gi|949996


0.14 


451 


V00602 


Genome of the 
bacteriophage fd 
(Inoviridae). 


0.54 


2661620 


(AL009197) hypothetical 
protein 


0.11 


452 


U60800 


Human semaphorin 
(CD100) mRNA,
complete cds 


0.54 


125682 


KERATIN, ULTRA HIGH-
SULFUR MATRIX PROTEIN
(UHS KERATIN)
>gi[iuvi LO|pir||AJOooo ultra- 
high-sulfur keratin - sheep 
>gi|1306 (X55294) ultra high- 
sulphur keratin protein [Ovis 
aries]


0.003 


453 


XS5969 . 


S.coelicolor secD,
secF & apt genes


0.54 


3874972


(Z99709) similar to Elongation
factor Tu family (contains
ATP/GTP binding P-loop);
cDNA EST EMBL:D76223
comes from this gene; cDNA
EST yk478c5.5 comes from this
gene [Caenorhabditis elegans]


7e-06 


454 


Y08265


H.sapiens mRNA for
DAN26 protein,
partial


0.54 


3875131


(Z70750) similar to vanadate
resistance protein
transmembranous domains
[Caenorhabditis elegans]


5e-12



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION
Hydromantes


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


455 


U89613 


platycephalus
cytochrome b (cytb) 
gene, mitochondrial 
gene encoding 
mitochondrial 
protein, partial cds 


0.53 


<NONE> 


<NONE> 


<NONE>


456 


AF034597 


Habrobracon hebetor 
cytochrome oxidase 
II gene, partial cds; 
and tRNA-Asp, tRNA 
His, and tRNA-Lys 
genes, complete 
sequence, 

mitochondrial genes 
for mitochondrial 
products 


0.53 


<NONE> 


<NONE> 


<NONE>


457 


K02653 


Yeast (S.cerevisiae) 
tau repetitive element 
and Cys-tRNA.


0.53 


<NONE> 


<NONE> 




458 


X53416 


Human mRNA for 
actin-binding protein 
(filamin) 


0.53 


2134839 


bullous pemphigoid antigen 2 - 
human 


6.2


459 


M55545


Drosophila
subobscura alcohol
dehydrogenase (Adh)
gene, and alcohol
dehydrogenase (Adh-
dup) gene, complete
cds's.


0.53 


2136865


hair keratin cysteine rich protein -
sheep


2.1
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















460 


U19362


Methanobacterium
thermoautotrophicum
methylene-
tetrahydromethanopte
rin dehydrogenase
(mtd),
imidazoleglycerol-
phosphate
dehydrogenase
(hisB), and putative 
ferredoxin (fdxA) 
genes, complete cds, 
orf9 gene, partial cds, 
orfs ... 


0.53 


731969 


HYPOTHETICAL 91.6 KD 
PROTEIN IN HXT8-CRT1 
INTERGENIC REGION 
>gi|1078261|pir||S50773
probable membrane protein
YJL212c - yeast
(Saccharomyces cerevisiae)
>gi|496950 (Z34098) ORF
[Saccharomyces cerevisiae]
>gi|1015596 (Z49487) ORF
YJL212C 


0.54 


461 


AB011527 


Rattus norvegicus 
mRNA for MEGF1,
complete cds 


0.53 


417037 


GERM CELL-LESS PROTEIN 
Fruit fly (Drosophila 
melanogaster) >gi| 157490 
(M97933) germ cell-less protein 
[Drosophila melanogaster]


3e-06 


462 


U64313 


Bacillus firmus MsyB 
gene, 5' upstream
region and partial cds 


0.52 


<NONE> 


<NONE> 


<NONE> 


463 


AF008590 


Caenorhabditis 
elegans paraquat 
responsive protein 
(CePqM132) mRNA, 
complete cds 


0.52 


<NONE> 


<NONE> 


<NONE> 


464 


L10245


Mus saxicola
spermidine/spermine
N1-acetyltransferase
(SSAT) gene, 
complete cds. 


0.52 


<NONE> 


<NONE> 


<NONE> 


465 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.52 


124263 


INSULIN-LIKE GROWTH
FACTOR IB PRECURSOR
(IGF-IB) (SOMATOMEDIN C)
>gi|69361|pir||IGHU1B insulin-
like growth factor IB precursor - 
human prepropeptide [Homo 
sapiens] 


7.7 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Caenorhabditis










466 


AL021066 


elegans cosmid 
H31B20, complete
sequence
[Caenorhabditis
elegans]


0.52 


2589162 


(D88451) aldehyde oxidase [Zea 
mays] 


f\ ft 


467 


AF038588 


Porphyra linearis 18S 
ribosomal RNA gene, 
3' partial sequence 


0.52 


1055055 


(U39ibO) coded for by C.
elegans cDNA yk37g1.5; coded
for by C. elegans cDNA
yk5c9.5; coded for by C.
elegans cDNA yk1a9.5;
alternatively spliced form of 
F52C9.8b 


4.6 


468 


AE001125 


Borrelia burgdorferi 
(section 11 of 70) of
the complete genome


0.52 


4115827 


(AB021287) polyprotein
[Hepatitis O virus] 


2.0 


469 


AF006640 


Drosophila 
melanogaster Ste20- 
like protein kinase 
mRNA. complete cds 


0.52 


1109830 


(U41534) coded for by C.
elegans cDNA CEESI42F;
Similar to helicases of 
SNF2/RAD54 family. 
[Caenorhabditis elegans] 


0.002 


470 


U90177 


Aplysia californica
ubiquitin carboxyl- 
terminal hydrolase 
(Ap-uch) mRNA. 
complete cds 


0.51 


<NONE> 


<NONE> 


<NONE> 


471 


Z28304 


S.cerevisiae 
chromosome XI 
reading frame ORF 
YKR079C 


0.51 


<NONE> 


<NONE> 


<NONE> 


472 


Z92837 


Caenorhabditis 
elegans cosmid 
R03E1, complete
sequence
[Caenorhabditis
elegans]


0.51 


123506 


HYDROPHOBIC SEED 
PROTEIN (HPS) 


7.6 


473 


D13803 


Mouse mRNA for 
RecA-like protein 
MmRad51. complete 
cds 


0.51 


3327228 


(AB014607) KIAA0707 protein
[Homo sapiens] 


4.5 


474 


X07187 


Pea hsp21 mRNA


0.51 


3328678 


(AE001299) hypothetical 
protein [Chlamydia trachomatis] 


4.4 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






CCAAT/enhancer-










475 


S63168 


binding protein 
delta=transcription 
factor CRP3 homolog 
[human, prostate 
carcinoma cell line 
LNCaP, Genomic, 
1594 nt] 


0.51 


1653215 


(D90911) apolipoprotein N-
acyltransferase [Synechocystis 
sp.] 


1.2 


476 


U67078 


Xenopus laevis C2- 
HC type zinc finger 
protein X-MyT1


0.51


3850320 


(AF067520) PITSLRE protein 
kinase beta SV2 isoform [Homo 
sapiens] 


0.17 


477 


L38933 


Homo sapiens GT198 
mRNA. complete 
ORF 


0.51 


3219965 


HYPOTHETICAL 100.6 KD 
TRP-ASP REPEATS 
CONTAINING PROTEIN 
C2C6.04C IN CHROMOSOME 
I 


U.Ujy 


478 


AF001000 


Lycopersicon 
esculentum 
polygalacturonase 1


0.50 


<NONE> 


<NONE> 


<NONE> 


479 


Z28304


S.cerevisiae 
chromosome XI 
reading frame ORF 
YKR079c 


0.50 


<NONE> 


<NONE> 


<NONE> 


480 


X97225 


Oncorhynchus keta 
GF-II gene 


0.50 


<NONE> 


<NONE> 


<NONE> 


481 


AJ001388


Homo Sapiens, RP58
cDNA for complete
mRNA


0.50 


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo Sapiens, RP58










481 


AJ001388 


cDNA for complete

mRNA 


0.50 


<NONE> 


<NONE> 


<NONE> 


482 


M86626 


P.occultum 23S 
ribosomal RNA, 
partial cds. 


0.50 


<NONE> 


<NONE> 


<NONE> 


483 


U76523 


Sambucus nigra lectin
precursor mRNA, 
complete cds 



0.50 


1722856 


CHROMOSOME ASSEMBLY 
PROTEIN XCAP-E African 
clawed frog >gi|563814 
(U13674) XCAP-E [Xenopus 
laevis] 


3.2 


484 


AF031663 


Mus musculus striatin 
mRNA, complete cds 


0.50 


179521 


(M63730) BPAG2 [Homo 
sapiens] 


3.2 


485 


U32729 


Haemophilus 
influenzae Rd section 
44 of 163 of the 
complete genome 


0.50 


3875699 


(Z92829) F10A3.15 
[Caenorhabditis elegans]


0.65 


486 


AF067198 


Dictyostelium
discoideum clone
9.10Tdd-3 and RED
repetitive elements, 
partial sequence 


0.50 


2494740 


HYPOTHETICAL 28.3 KD 
PROTEIN IN GBD 5"REGION 
(ORF4) >gi|2120954|pir||I39562 
ORF4 - Alcaligenes eutrophus 
>gi 1695274 (L36817) ORF4 


0.008 


487 


M23442 


Human interleukin 4 
(IL-4) gene, complete 
cds. 


0.49 


<NONE> 


<NONE> 


<NONE>


488 


U16367 


Caenorhabditis 

elegans POU
homeobox protein
CEH-18 (ceh-18) 
mRNA, complete cds. 


0.47 


3786409 


(AF098499) contains similarity 
to Saccharomyces cerevisiae 
MAPI protein (GB.U19492) 
[Caenorhabditis elegans] 


8.9 


489 


AF001000 


Lycopersicon 
esculentum
polygalacturonase I


0.45 


<NONE> 


<NONE> 


<NONE> 


490 


Z18920


Yersinia
enterocolitica wbb
gene cluster


0.41 


<NONE> 


<NONE> 


<NONE> 


491 


D86983


Human mRNA for
KIAA0230 gene,
partial cds


0.35 


206712


(M64793) salivary proline-rich
protein [Rattus norvegicus]


4e-05 


492 


AF064030


Helianthus tuberosus
lectin 2 mRNA,
complete cds


0.33 


<NONE> 


<NONE> 


<NONE>



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



Vitreoscilla sp. outer



493 



AF067083 



membrane protein 
homolog gene, 
complete cds; Trp
repressor binding 
protein gene, partial 
cds; and unknown 

genes



0.33 



401553 



HYPOTHETICAL 24.5 KD 
PROTEIN IN NADB-SRMB 
INTERGENIC REGION 



8.3 



494 



Y15520



Papio hamadryas 
anubis gene encoding 
fertilin alpha-II 



0.29 



2408049 



(Z99164) hypothetical protein 



lyp othetical pre 
ARYL HYDROCARBON
RECEPTOR NUCLEAR 
TRANSLOCATOR 
HOMOLOG (DARNT) 
(TANGO PROTEIN) 
transcription factor [Drosophila 
melanogaster]



495 



U33475 



Alestes sp. 
ependymin mRNA, 
partial cds 



0.28 



3913078 



3.1 



496 



D88356 



Mouse DNA for 8- 
oxodGTPase. 
complete cds 



0.22 



<NONE> 



<NONE> 



<NONE> 



497 



U67603 



Methanococcus 
jannaschii section 145 
of 150 of the 
complete genome 



0.22 



2209261 



(U51222) p40 [Streptomyces
halstedii] 



8.3 



498 



U82386 



Malurus cyaneus 
microsatellite McyU2 



0.22 



499 



Z49625 



S.cerevisiae 
chromosome X 
reading frame ORF 
YJR125C 



992631 



(U29131) Mg-chelatase subunit 
[Synechocystis sp.]



0.56 



0.21 



500 



U64830 



Dictyostelium
discoideum AX2 
protein tyrosine 
kinase gene, complete 
cds. 



<NONE> 



<NONE> 



<NONE> 



0.21 



<NONE> 



<NONE> 



<NONE> 



501 



M24543 



Human prostate- 
specific antigen (PA) 
gene, complete cds. 



0.21 



2764859 



(X97918) gene 12.1
[Bacteriophage SPP1] 



6.0 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












uOOO'ib protein - 




502 


X87618 


B.taurus mRNA for 
thrombospondin 
(partial) 2162 bp


0.21 


2146000 


Mycobacterium tuberculosis 
tuberculosis] 

>gi|1694863|gnl|PID|e283373 
(Z83018) hypothetical protein 
Rv2968c [Mycobacterium 
tuberculosis]


3.5 


503 


X71591 


B.taurus 
microsatellite 
sequence INRA048 


0.21 


1354453 


(U52830) orf [Homo sapiens]


2.7 


504 


X57808 


immunoglobulin 
lambda light chain 
gene 


0.21 


2119158 


procollagen type V alpha 2 - 
mouse>gi|309181 


2.7 


505 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA. partial cds 


0.21 


2497139 


h v I'cri'HE'ricAL ;o kd — 

PROTEIN IN ABF2-CHL12 
INTERGENIC REGION 
>gi|1078003|pir||S52835 
hypothetical protein YMR075w 
yeast (Saccharomyces 
cerevisiae) >gi|763022 
(Z43952) unknown 
[Saccharomyces cerevisiae]


2.0 


506 


r ISA') 1 £L 


Mycobacterium 

fortuitum plasmid

pJAZ38 replication 
protein Rep (rep) 
gene, complete cds 


0.21 


2499087 


UDP-
GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT)
glucosyltransferase - fruit fly
(Drosophila sp.)
glucosyltransferase precursor
[Drosophila melanogaster]


0.003 


507 


U31463 


Rattus norvegicus 
nonmuscle myosin
heavy chain-A
mRNA, complete cds. 


0.21 


3880111 


i-Ol LJVJJ LJICUIL-iCU Uolllg 

Genefinder 


0.002 


508 


X51508 


Rabbit mRNA for 
aminopeptidase N 
(partial)


0.21 


630864


LRR47 protein - fruit fly
(Drosophila melanogaster)
>gi|415947 (X75760) LRR47
[Drosophila melanogaster]


1e-06


509 


AF086476


Homo sapiens full
length insert cDNA
clone ZD88F12


0.20 


<NONE> 


<NONE> 


<NONE> 


510 


AF077006


Helicobacter pylori
plasmid pHPM186,
complete sequence


0.20 


<NONE>


<NONE> 


<NONE>


511


X75480


E.gunnii CAD gene.


0.20 


<NONE> 


<NONE> 


<NONE>
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE






T.aestivum










512 


X75036 


mitochondrial nad7 
gene for NADH 
dehydrogenase 
subunit 7


0.20 


<NONE> 


<NONE> 


<NONE>


513 


D90875 


E.coli genomic DNA 
Kohara clone 
#422(55.5-55.8 min.) 


0.20 


<NONE> 


<NONE> 


<NONE>


514 


Z68343 


Caenorhabditis 
elegans cosmid 
F59B8, complete 
sequence 
[Caenorhabditis 
elegans] 


0.20 


<NONE> 


<NONE> 


<NONE>


515 


X62486 


M.musculus V alpha 
11.1 gene 5'-region


0.20 


<NONE> 


<NONE> 


<NONE>


516 


AF040651 


Caenorhabditis 
elegans cosmid 
W04H10 


0.20 


1170683 


PHOSPHORYLASE B
KINASE ALPHA
REGULATORY CHAIN,
SKELETAL MUSCLE
ISOFORM
(PHOSPHORYLASE KINASE
ALPHA M SUBUNIT)
>gi|2135923|pir||I38111
phosphorylase kinase (EC
2.7.1.38) - human >gi|791043


7.4


517 


U10470


Pseudomonas
fluorescens PHA
depolymerase (phaZ) 
gene, complete cds. 


0.20 


3721862 


(AB016024) Pfj2 [Plasmodium
falciparum]


1.9


518 


D83778 


Human mRNA for 
KIAA0194 gene, 
partial cds 


0.20 


126363


LAMININ ALPHA-1 CHAIN
PRECURSOR precursor -
human


0.65


519 


S43579


c-src=pp60c-src,
sdr=src downstream
region


0.20 


4159887


(AC004908) similar to
ribosomal protein L23a; similar
to P29316 (PID:g132848)
[Homo sapiens]


0.52


520 


U07357


Mus musculus Balb/c
brain-specific kinase
(Bsk) mRNA,
complete cds.


0.20 


206712


(M64793) salivary proline-rich
protein [Rattus norvegicus]


0.51



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















521 


AF034460 


Penicillium thomii
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene 
and internal 
transcribed spacer 2, 
complete sequence; 
and 28S ribosomal 
RNA gene, partial 
sequence 


0.20 


114136 


AMINO-ACID 
ACETYLTRANSFERASE 
Pseudomonas aeruginosa 
>gi|151036 (M38358) N- 
acetylglutamate synthase 
[Pseudomonas aeruginosa] 


0.39 


522 


U95098 


Xenopus laevis
mitotic 

phosphoprotein 44 
mRNA. partial cds 


0.20 


2842674 


POU DOMAIN CLASS 2,
ASSOCIATING FACTOR 1 (B-
CELL-SPECIFIC
COACTIVATOR OBF-1) (OCT
BINDING FACTOR 1) (BOB-
1) (OCA-B) Bob1, B-cell-
specific - mouse
>gi| I SS I S 1 S|bbs| 1 79852 
mBob1=B-cell specific
transcriptional coactivator line
J558L, Peptide, 256 aa]
>gi|1353792 (U43788) Oct
binding factor 1 [Mus musculus] 


0.073 


523 


X95971 


S.lividans groEL2 
gene 


0.20 


3925277 


(.ALUiJ64i) similar to 
Uncharacterized protein family 
UPF0034. Double-stranded 
RNA binding motif; cDNA EST 
yk4S9b3.5 comes from this 
gene; cDNA EST yk439g7.5 
comes from this gene 
[Caenorhabditis elegans]


4e-19 


524 


L41502 


vasopressin V1
receptor (V1R) gene, 
complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


525 


J03885 


K.pneumoniae
oxalacetate 
decarboxylase alpha 
subunit gene, 
complete cds. 


0.19 


<NONE> 


<NONE> 


<NONE> 


526 


AE001451 


Helicobacter pylori, 
strain J99 section 12 
of 132 of the 
complete genome 


0.19 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















527 


D88084 


Pedicularis 
verticillata 
chloroplast DNA. 
intergenic region
between trnT(UGU)
and trnL(UAA) 5'exon


0.19 


<NONE> 


<NONE> 


<NONE> 


528 


U67599 


Methanococcus 
jannaschii section 14 '. 
of 150 of the 
complete genome 


0.19 


<NONE> 


<NONE> 


<NONE> 


529 


J05500 


Human beta-spectrin 
(SPTB) mRNA. 
complete cds. 


0.19 


<NONE> 


<NONE> 


<NONE> 


530 


Y10137 


M.mycoides ftsY 
gene homologue and 
gene encoding 
hypothetical protein 


0.19 


<NONE> 


<NONE> 


<NONE> 


531 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


532 


D43805 


Mouse thymic 
stromal cell mRNA 
for TLSF-beta, 
complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


533 


AJ012585 


Tetrahymena 
thermophila 
macronuclear gene 
encoding ribosomal 
protein L3. exons 1-2 


0.19 


<NONE> 


<NONE> 


<NONE> 


534 


X51475 


Brassica napus 5- 
enolpyruvylshikimate- 
3-phosphate synthase 
gene 


0.19 


<NONE> 


<NONE> 


<NONE> 


535 


AP074386 


Sambucus nigra 
hevein-like protein
mRNA. complete cds 


0.19 


<NONE> 


<NONE> 


<NONE> 


536 


Z49625


S.cerevisiae
chromosome X
reading frame ORF
YJR125c


0.19 


<NONE> 


<NONE> 


<NONE> 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



H.sapiens pilot 



DESCRIPTION 



P VALUE



538 



X63741 



mRNA



0.19 



Y11255 



O.latipes mRNA for 
annexin max4 



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE>



539 



540 



L63537 



X70903 



Oncorhynchus mykiss
(clone Jb-10) beta-2 
microglobulin (B2m) 
mRNA. complete cds 



0.19 



N.tabacum T92 gene
for auxin-binding
protein



<NONE> 



<NONE> 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



541



542 



U61958 



Caenorhabditis 
elegans cosmid 
C25A8 



0.19 



<NONE> 



U33959 



Macaca fascicularis 
fertilin beta mRNA, 
complete cds 



0.19 



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



543 



Z49835 



H.sapiens mRNA for 
protein disulfide 
isomerase 



0.19 



2113940 



(Z95556) hypothetical protein 
Rv2507 



9.4 



544 



AF035458 



Spinacia oleracea 
heat shock 70 protein 
protein, complete cds 



0.19 



267293 



PROBABLE E4 PROTEIN
papillomavirus (type 1)
>gi|61015 (X62844) E4 gene
product [Pygmy chimpanzee
papillomavirus type 1]



9.4 



545 



546 



547 



548 



U23441 



Tetrahymena 
thermophila B 
internal deletion
sequence. 



0.19 



U53921 



Pneumocystis carinii 
major surface 
glycoprotein 



3877185 



(Z66563) F46C3.2
[Caenorhabditis elegans]



L11002



Rat ankyrin binding 
glycoprotein- 1 related 
mRNA sequence. 



U67560 



Methanococcus 
jannaschii section 102 
of 150 of the 
complete genome 



0.19 



3548901 



(AF052502) DA26 homolog
[Epiphyas postvittana
nucleopolyhedrovirus] 



0.19 



3337352 



(AC004481) putative chromatin
structural protein Supt5hp 



0.19 



3183689 



(Y13585) serotonin receptor 4
[Cavia porcellus]



9.3 



9.3 



9.1 



8.7 



549 



U18424



Mus musculus 
bacteria binding 
macrophage receptor 
MARCO mRNA. 
complete cds. 



0.19 



3659853 



AF0S90S3) complement 
component ClqB like protein 



7.1 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(U58751) C07G1.7 gene




550 


X66467 


C.albicans sec18 gene


0.19


1326385


product [Caenorhabditis
elegans]


6.9 


551 


AF003487 


Syngaster lepidus 16S
ribosomal RNA gene 
partial sequence 


0.19 


3122039 


DIHYDROPYRIMIDINASE
(DHPASE) dihydropyrimidinase
- rat
>gi|1378019|gnl|PID|d1010479


6.9


552 


J05087 


Rat calmodulin- 
sensitive plasma 
membrane Ca2+- 
transporting ATPase 
(PMCA3) mRNA, 
complete cds. 


0.19 


422462 


hypothetical protein - fruit fly 
(Drosophila melanogaster) 
>gi|296434 (X68408) ORF 
[Drosophila melanogaster] 


5.3


553 


AF080464 


Homo sapiens 
glutamate 
oxaloacetate 
transaminase 


0.19 


3024834 


PROBABLE E4 PROTEIN 
>gi|790898 position 3286.-3288 
is first start codon; putative 


5.3


554


U78876


Human MEK kinase 
3 mRNA. complete 
cds 


0.19 


1710445 


(U78083) unknown [Emericella 
nidulans]


5.3


555 


AB009077 


Vigna radiata mRNA 
for proton 
pyrophosphatase, 
complete cds 


0.19 


3256922 


(AP000002) 256aa long 
hypothetical protein 
[Pyrococcus horikoshii]


5.1


556 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44
mRNA. partial cds 


0.19 


4226159 


(AF125463) contains similarity
to BTB (also known as BR-
C/Ttk) domains (Pfam:PF00651,
Score=62.8, E=7.6e-15, N=1)
[Caenorhabditis elegans]


4.1


557 


AE000392


Escherichia coli K-12
MG1655 section 282
of 400 of the
complete genome


0.19 


3645960


(AL031583) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNE
CTED PROTEIN.; 2-matc...


4.0






PCT/US00/18374 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



P VALUE



558 



AE000392 



Escherichia coli K-12
MG1655 section 282 
of 400 of the 
complete genome 



0.19 



3645960 



(AL031583) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=47.46; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SWISS-
PROT:P23792; 2-
match_description=DISCONNE
CTED PROTEIN.; 2-matc...



4.0 



559 



L81774 



Homo sapiens 
(subclone 3_dl from 
PI H25) DNA 
sequence 



0.19 



(AB015981) MnhA
4001725 [Staphylococcus aureus]



3.0 



560 



AL021108 



Drosophila 
melanogaster cosmid 
clone 137E7 



0.19 



(AB015718) protein kinase
4001688 [Homo sapiens]



3.0 



561 



AB001510



Carabus 
leptoplesioides 
mitochondrial DNA 
for NADH 
dehydrogenase 
subunit 5, partial cds 



0.19 



3758855 



(Z98551) MAL3P6.11
[Plasmodium falciparum]



562 



563 



AF069696 



Egernia stokesii clone
EST1 microsatellite 



0.19 



3328994 



(AE001326) Amino Acid 
(Branched) Transport
[Chlamydia trachomatis]



2.4 



2.4 



X64144 



F.pringlei ppcAl 
gene for 
phosphoenolpyruvate 
carboxylase 



0.19 



3242974 



(AF069555) G protein-coupled 
receptor p2y3 [Meleagris
gallopavo]



2.3 



564 



U56897 



Human
immunodeficiency 
virus type 1 gag 
polyprotein (gag) 
gene, partial cds 



0.19 



(U73041) resolvase-like protein
2257710 [Thiobacillus ferrooxidans]



2.3 



565 



U57975 



Danio rerio Notch
homologue 3 mRNA,
complete cds 



0.19 



3874971 



KZW7Wj similar to NAD
dependant
epimerase/dehydratase family;
cDNA EST EMBL:C10103
comes from this gene; cDNA
EST EMBL:D66400 comes
from this gene; cDNA EST
EMBL:D70143 comes from this
gene; cDNA EST yk493h11.3
comes from ...



1.8 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION


P VALUE 












masquerade precursor - fruit fly




566


Y12502


R.norvegicus mRNA
for factor XIIIa


0.19 


2133693 


(Drosophila melanogaster) 
>gi|665545 (U18130) 
masquerade [Drosophila 
melanogaster] 
>gi|1095942|prf||2110286A 
masquerade gene 


1.8 


567 


S82470 


BB1=malignant cell
expression-enhanced
gene/tumor
progression-enhanced
gene [human, UM- 
UC-9 bladder 
carcinoma cell line, 
mRNA. 1897 nt] 


0.19 


2444026 


(U77783) N-methyl-D-aspartate 
receptor 2D subunit precursor
[Homo sapiens]


1.8


568 


U97408 


Caenorhabditis 
elegans cosmid 
F48A9 


0.19 


542433 


225K protein - Babesia bovis 
(fragment) 


1.8


569


U10470


Pseudomonas 
fluorescens PHA 
depolymerase (phaZ) 
gene, complete cds. 


0.19 


3721862 


(AB016024) Pfj2 [Plasmodium
falciparum]


1.7


570 


M88160 


Ovis aries MAF214 

locus polymorphic 

dinucleotide repeat . 
nuiLm> uiuim r 


0.19 


1293816 


(U56963) T13A10.5 gene 
product [Caenorhabditis 
elegans]


1.4


571 


AJ131336


mRNA for pollen
allergen (Hol l 2,
group II) > ::
emb|AJ131339|LIT13
1339 Lolium italicum
mRNA for pollen
allergen (Lol i 2,
group II) > allergen
(Poa p 2, group II) >
emb|AJ131338|TAE1
31338 Triticum
aestivum mRNA for
pollen allergen (Tri a
2, group II)


0.19 


3880447


(AL032675) predicted using
Genefinder


0.82


572 


X84036


S.cerevisiae ARG8
and CDC33 genes


0.19 


3882041


(AJ010405) hypothetical protein


0.62
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human WD protein 






mucin - human >gi|501033 




573 


U57058 


DUO pre-mRNA. 
partial cds 


0.19 


631302 


(U14383) mucin [Homo
sapiens] 


0.60 


574 


AF034460 


Penicillium thomii 
internal transcribed 
spacer 1, 5.8S 
ribosomal RNA gene
and internal
transcribed spacer 2,
complete sequence;
and 28S ribosomal
RNA gene, partial
sequence 


0.19 


114136 


AMINO-ACID
ACETYLTRANSFERASE
Pseudomonas aeruginosa
>gi|151036 (M38358) N-
acetylglutamate synthase 
[Pseudomonas aeruginosa] 


0.35 


575 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


0.19 


105270 


alpha-2-adrenergic receptor -
human name 'ADRA2R' [Homo 
sapiens] 


0.27 


576 


AG001475 


Homo sapiens 
genomic DNA. 21q 
region, clone: 
125H6N2 


0.19 


94977 


hypothetical protein 3 - 
Pseudomonas sp. (DSM 6898) 
plasmid pKB740 >gi|45867 
(X66604) ORF3 


0.16 


577 


M63284 


Mouse IgG receptor 
(beta-Fc-gamma-RII) 
gene, exons 9 and 10, 
clones lambda- 
Fc(3.2.93). 


0.19 


3024681 


TRANSCRIPTION 
INITIATION FACTOR TFIID
135 KD SUBUNIT (TAFII-135)
(TAFII135) (TAFII-130) of
RNA polymerase II transcription 
factor TFIID [Homo sapiens] 


0.088 


578 


U38241 


Pseudomonas 
aeruginosa orotate 
phosphoribosyl
transferase (pyrE), 
catabolite repression 
control protein (crc) 
and RNasePH (rph) 
genes, complete cds 


0.19 


3044086 


(AF055904) unknown 
[Myxococcus xanthus] 


0.052 


579 


AF039734 


Lontra longicaudis 
transthyretin intron I, 
partial sequence 


0.19 


322759 


pistil extensin-like protein 
(clone pMG14) - common 
tobacco (fragment) >gi|19927
(Z14015) pistil extensin like
protein [Nicotiana tabacum] 


0.030 


580 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA. 
complete cds 


0.19 


2147194 


collagen - Paralvinella grasslei


0.002 



SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



581



582 



DESCRIPTION 



P VALUE 



AB004232



AF098919 



Drosophila 
melanogaster mRNA 
for DAD polypeptide 
complete cds 



Gallus gallus alpha- 
globin gene domain 
region 



0.19 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



2498765 



0.19 



1086863 



P VALUE 



PEROXISOMAL MEMBRANE 
PROTEIN PEX16 lipolytical 



(U41272)T03G11.6 gene 
product [Caenorhabditis 
elegans] 



0.002 



4e-05 



583 



AE001457 



Helicobacter pylori, 
strain J99 section 18
of 132 of the 
complete genome 



0.19 



2924552 



(AL022018) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=165.48; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=AA264666; 2-
match_description=LD08351.5p
rime LP Drosophila melanoga



3e-05 



584 



L10329



Plasmid RP4 traE
gene, 3' end; traD
gene, complete cds;
traF gene, 5' end.



0.19 



3878117 



(Z49068) mitochondrial carrier 
protein 



8e-07 



585 



AE001155 



Borrelia burgdorferi 
(section 41 of 70) of 
the complete genome 



0.19 



861276 



(U28739) similar to TPR 
domains in e.g. yeast STI1 
protein [Caenorhabditis elegans]



2e-12 



586 



U49979 



Orf virus E10R 
homolog gene, partial 
cds, and DNA
polymerase gene,
complete cds 



0.19 



3850072 



(AL033385) dna-directed rna 
polymerase iii subunit 
[Schizosaccharomyces pombe]



1e-15



587 



U88155 



Xenopus laevis 
RanGTPase
activating protein 



0.19 



995714 



(X91258) pid:el98503 
[Saccharomyces cerevisiae] 



4e-16 



588


AF061854


Schizosaccharomyces
pombe Clr4p (clr4)
gene, complete cds


0.19


3242750


(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)


5e-19


589


M23865


S.cerevisiae CHS2
gene encoding chitin
synthase.


0.18


<NONE>


<NONE>


<NONE>


SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


590 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.18 


<NONE> 


<NONE> 


<NONE> 


591 


AF067610


Caenorhabditis 
elegans cosmid 
F41A4 


0.18


<NONE>


<NONE>


<NONE>


592 


AF036329 


Homo sapiens 
gonadotropin- 
releasing hormone 
precursor, second 
form (GnRH-II) gene 
complete cds 


0.18


<NONE>


<NONE>


<NONE>


593 


Z49216


H.sapiens
mitoxantrone- 
resistance associated 
mRNA 


0.18 


<NONE> 


<NONE> 


<NONE> 


594 


X02167 


Torulopsis glabrata 
mitochondrial DNA 
for tRNA-Thr,-His 
and -Glu upstream of 
cytochrome b gene 


0.13


<NONE>


<NONE>


<NONE>


595 


Z31561 


R.communis 
(Carmencita) Scr1
mRNA for sucrose 
carrier 


0.18


<NONE>


<NONE>


<NONE>


596 


L81692 


Homo sapiens
(subclone 2_c9 from
P1 H56) DNA
sequence 


0.18 


1346575 


55 KD ERYTHROCYTE 
MEMBRANE PROTEIN 


8.4


597


X57310


Nocardia
lactamdurans pcbAB
and pcbC genes for
alpha-aminoadipyl-L-
cysteinyl-D-valine
synthetase and
isopenicillin N
synthase


0.18


126404


SEED LIPOXYGENASE-2 (L-
2) soybean >gi|170014 (J03211)
lipoxygenase (EC 1.13.11.12)


6.5


598


U18315


Sus scrofa
parathyroid receptor
(PTH) mRNA,
complete cds


0.18


1022323


(K04647) collagen alpha-2(IV)
chain [Mus musculus]


3.8






WO 01/02568 



PCT/US00/18374 



SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


599 


AL010158


Plasmodium 
falciparum DNA *** 

SEQUENCING IN 
PROGRESS ***

JIU1I1 WUIllig J"OJ, 

complete sequence 


0.18 


2506816 


VERSICAN CORE PROTEIN
PRECURSOR
PROTEOGLYCAN CORE
PROTEIN 2) (GLIAL
HYALURONATE-BINDING
PROTEIN) (GHAP) >gi|608515
(U16306) chondroitin sulfate 
proteoglycan versican V0 splice 
variant precursor peptide 


3.7 


600 


AB005287


Bos taurus mRNA for 
thrombospondin I, 
complete cds 


0.18 


2146000 


uU001!b protein - 
Mycobacterium tuberculosis 
tuberculosis] 

>gi|1694863|gnl|PID|e283373 
(Z83018) hypothetical protein
Rv2968c [Mycobacterium 
tuberculosis] 


2.9 


601 


AT (Y? 1 1 nn 


Drosophila 
melanogaster cosmid 
cione i j /t / 


0.18


3483032 


(AL031371) hypothetical 
protein SC4G2.06 
[Streptomyces coelicolor]


2.9 


602 


I JS7Q7< 


Danio rerio Notch 
homologue 3 mRNA,
complete cds


0.18


85719 


collagen alpha 1'(II) chain
precursor - African clawed frog


1.7 


603 


M30124 


P.aeruginosa 
autonomously 
replicating sequence. 


0.18 


3878017 


( ALU2 1 JS7) similar to Zinc
finger, C4 type (two domains);
cDNA EST yk452f4.5 comes
from this gene; cDNA EST
EMBL:T00774 comes from this
gene receptor NHR-3
[Caenorhabditis elegans]


1.3 


604 


X54965 


j.sp alpha 5HR DNA 


0.18 


134304 


STEM CELL PROTEIN j 
rhicken >gi|62845 (X63371) 
ranst'orming capacity (Gallus 


1. J 


605


U95098


Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds


0.18


1628403


(X98893) hTAFII68 [Homo
sapiens] splicing [Homo
sapiens]


1.3


606


U20793


Oryctolagus
cuniculus renal
sodium-dependent
phosphate transporter
type II mRNA,
complete cds.


0.18


1705984


92 KD TYPE IV
COLLAGENASE
PRECURSOR IV, 92K,
precursor - rat >gi|1022784
(U36476) 92-kDa type IV
collagenase [Rattus norvegicus]


1.2






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


607 


U23427 


Human 

cholecystokinin type 
A receptor (CCK-A) 
gene, exons 1 and 2. 


0.18 


3261734 


(Z94752) hypothetical protein
Rv1004c


0.97 


608 


U49953 


Rattus norvegicus
protein kinase MUKT
mRNA, complete cds


0.18 


551238 


(X81847) pectate lyase I 
[Erwinia carotovora] 


0.43 


609 


J00182 


Human alpha globin 
gene cluster on 
chromosome 16: zeta 
gene. 


0.18 


1585259 


traj gene [Amycolatopsis 
methanolica]


0.41 


610 


X62513 


M.gallopavo gene for 
metallothionein 


0.18 


2494740 


HYPOTHETICAL 28.3 KD 
PROTEIN IN GBD 5' REGION
(ORF4) >gi|2120954|pir||I39562
ORF4 - Alcaligenes eutrophus
>gi|695274 (L36817) ORF4 


0.31 


611 


X04862 


Goat embryonic alpha 
globin gene zeta 
exons 2-3 


0.18 


86837 


androgen receptor B - human


0.082 


612 


M12450


Rat vitamin D
binding protein
(DBP) mRNA,
complete cds. 


0.18 


4210432 


(AJ130783) APC2 protein [Mus
musculus] 


0.038 


613


AF038539 


Mus musculus muscle 
NSP-like 1 (Nspl1)
mRNA, complete cds


0.18


3297877 


(AJ224868) GNAS1 [Homo
sapiens] 


0.029 


614 


i 
1 
c 
< 

c 

X78401 c 


Bacteriophage P22 
right operon. orf 48. 
'eplication genes 18 
ind 12. nin region 
jenes, ninG 
mosphatase. late 
ontrol gene 23, orf 
)0, complete cds, late 
ontrol region, start 
f lysis gene 13 


0.18 


( 
P 

1123087 e 


U42436) C49H3.3 gene 

roduct [Caenorhabditis 1 

leaans] 1 


0.009 


615 


F 
a 
ii 

D38754 h 


ig mRNA for inter- 
Ipha-trypsin . 
ihibitor heavy-chain 
I. complete cds 


0.1S 


(1 
P 

1397275 le 


J61947) C06G3.8 gene 
roduct [Caenorhabditis 
eaans) 


7e-06


Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE












616


X51508


Rabbit mRNA for
aminopeptidase N
(partial)


0.18


630864


LRR47 protein - fruit fly
(Drosophila melanogaster)
>gi|415947 (X75760) LRR47
[Drosophila melanogaster]


6e-07 


617 


X54850 


S.kluyveri linear 
plasmid pSKL DNA 
for open reading 
frames 1-10 


0.18 


3183405 


1 kypolHtncAL rr.TKD — 

PROTEIN C2C6.07 IN 
CHROMOSOME I 
>gi|2370504|gnl|PID|e339194
pombe]

>gi|3451305|gnl|PID|e1316730
(AL031324) very hypothetical
protein [Schizosaccharomyces
pombe]


2e-08 


618 


L21954 


Human peripheral 
benzodiazepine 
receptor gene, exon 4.


0.18 


3925211 


fALUjibibj clJNA EST 

EMBL:D70654 comes from this
gene; cDNA EST
EMBL:Z14359 comes from this
gene; cDNA EST
EMBL:D33409 comes from this
gene; cDNA EST
EMBL:D36239 comes from this
gene; cDNA EST
EMBL:Z14766 comes from this
gene...


4e-09 


619 


U09355 


Oryctolagus 
cuniculus protein 
phosphatase 2A1 B 
gamma subunit 
(skeletal muscle 
isolate) mRNA, 
complete cds. 


0.18 


3947877


(AL034382) putative mitosis
and maintenance of ploidy
protein [Schizosaccharomyces
pombe]


8e-11


620 


X58715 


T.cruzi hsp70 mRNA
for 70 kDa heat shock 
protein, partial cds 


0.18 


3024081


MYOSIN LIGHT CHAIN
KINASE, SMOOTH MUSCLE
AND NON-MUSCLE
ISOZYMES (MLCK)
[CONTAINS: TELOKIN]


9e-12 


621


AF060195


Mus musculus
proteasome regulator
PA28 beta subunit
gene, complete cds


0.18


861276


(U28739) similar to TPR
domains in e.g. yeast STI1
protein [Caenorhabditis elegans]


1e-14


622


L27235


Methylobacterium
extorquens serine
cycle proteins


0.18


2688949


(AF027208) AC133 antigen
[Homo sapiens]


1e-14


SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


623 


AF006573 


Drosophila virilis 
maltase 1 (Mav1) and
maltase 2 (Mav2) 
genes, complete cds 


0.18 


2500558 


PUTATIVE RIBONUCLEASE
III (RNASE III)
>gi|3876420|gnl|PID|e1346063
(Z81070) similar to ribonuclease
[Caenorhabditis elegans]


2e-23 


624 


AF001782


Staphylococcus 
aureus strain SAS02A 
AerB 


0.17 


<NONE> 


<NONE> 


<NONE> 


625 


AJ223364 


Homo sapiens germ- 
line DNA upstream o 
Jkappa locus 


0.17 


<NONE> 


<NONE> 


<NONE> 


626 


J03059 


Human 

glucocerebrosidase
(GCB) gene, 
complete cds 


0.17 


<NONE> 


<NONE> 


<NONE> 


627 




Fugu rubripes Cal2 
gene for pheromone 
receptor, complete 
cds


0.17 


2198849 


(AF004900) E3KARP [Homo
sapiens] >gi|2665826
(AF035771) Na+/H+ exchanger
regulatory factor 2 [Homo
sapiens] factor 2 [Homo
sapiens]
>gi|3618353|gnl|PID|d1034182
exchanger isoform A3 [Homo
sapiens]


7.8 


628


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


0.17


539355


CD25 protein (version 1) -
yeast


7.5 


629


AF059650


Homo sapiens histone
deacetylase 3
(HDAC3) gene,
complete cds


0.17


482118


hypothetical protein C15H7.1 -
Caenorhabditis elegans


4.5






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


630


AF059650


Homo sapiens histone
deacetylase 3
(HDAC3) gene
complete cds


0.17


465932


HYPOTHETICAL 03.2 KD
PROTEIN F58A4.11 IN
CHROMOSOME III
>gi|3874287|gnl|PID|e1344088
EST EMBL:C12577 comes
from this gene; cDNA EST
yk227e7.5 comes from this
gene; cDNA EST yk303d1.5
comes from this gene; cDNA
EST yk314c12.5 comes from
this gene; cDNA ...
EMBL:C11886 comes from this
gene; cDNA EST
EMBL:C12577 comes from this
gene; cDNA EST yk227e7.5
comes from this gene; cDNA
EST yk303d1.5 comes from this
gene; cDNA EST yk314c12.5
comes from this gene; cDNA .


4.4


631


X55065


Chinese hamster
metallothionein II
gene


0.17


3687237


(AC005169) putative Cys3His
zinc-finger protein


1.5


632


U15230


Rattus norvegicus
oxytocin receptor
(OTR) gene, exon 3
and complete cds


0.17


542565


cyclin E type II - fruit fly
(Drosophila melanogaster)
>gi|429168 (X75027)
Drosophila cyclin E type II
[Drosophila melanogaster]


0.45


633


X04862


Goat embryonic alpha
globin gene zeta
exons 2-3


0.17


86837


androgen receptor B - human


0.080


634


AL010222


Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 4-09,
complete sequence


0.17


1177322


(X95466) CPG2 protein [Rattus
norvegicus]

>gi|1585593|prf||2208498A
plasticity-related gene [Rattus
norvegicus]


7e-07


635


X60111


H. sapiens mRNA for
MRP-1


0.17


3237306


(U92715) breast cancer
antiestrogen resistance 3 protein


3e-09


636


U49979


Orf virus E10R
homolog gene, partial
cds, and DNA
polymerase gene,
complete cds


0.17


3850072


(AL033385) dna-directed rna
polymerase iii subunit
[Schizosaccharomyces pombe]


7e-15








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE
















637 


AF006573 


Drosophila virilis
maltase 1 (Mav1) and
maltase 2 (Mav2)
genes, complete cds


0.17 


2500558 


PUTATIVE RIBONUCLEASE
III (RNASE III)
>gi|3876420|gnl|PID|e1346063
(Z81070) similar to ribonuclease
[Caenorhabditis elegans]


2e-29 


638 


AE001141 


Borrelia burgdorferi 
(section 27 of 70) of 
the complete genome


0.16 


1850327 


(U52370) fertilin beta [Homo 
sapiens] 


2.3 


639 


M72980 


Anthonomus grandis 
vitellogenin gene 
(VTG), complete cds.


0.12 


3242750 


(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)


2e-56 


640 


AF023532 


Simulium vittatum 

nlr<UC o gene, 

mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


0.11 


<NONE> 


<NONE> 


<NONE> 


641 


U76523 


Sambucus nigra lectin 
precursor mRNA,
complete cds 


0.10 


3482965 


(AL031369) putative protein


0.49 


642 


AJ001596


Danio rerio mRNA
for opioid receptor
homologue


0.099 


1706694 


LANOSTEROL SYNTHASE (EC
5.4.99.7) - fission yeast




643 


U26341


Oryctolagus
cuniculus Na and Cl
dependent betaine
transporter mRNA,
complete cds.


0.099 


2645804 


(AF033381) betaine
homocysteine methyl transferase
[Mus musculus]


0.59 


644


M11633


Bacteriophage Cp-5
(S. pneumoniae) 3'
inverted terminal
repeat.


0.082


2314695


(AE000649) type IIS restriction
enzyme R and M protein


4.3


645


X74103


Streptomyces sp.
gene for alkaline
serine protease I


0.073


1314734


(U54641) 220 kDa silk protein
[Chironomus thummi]


6.3



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE






646


Z72509


Caenorhabditis
elegans cosmid
F32G8, complete
sequence
[Caenorhabditis 
elegans] 


0.072 


<NONE> 


<NONE> 


<NONE> 


647 


X70282 


X.laevis xanf-1 gene 


0.070 


3851202 


lAi_uujyj») zo-3 [Homo 
sapiens] [Homo sapiens] 


0.40 


648 


Z69906 


Human DNA 
sequence from 
cosmid E141E2, on 
chromosome 22, 
complete sequence 
[Homo sapiens]


0.069 


<NONE> 


<NONE> 


<NONE> 


649 


AF056940 


Drosophila virilis 

ICUUU ill IS UUoU 11 1 VI, 

complete sequence 


0.069 


2246532 


(U93872) ORF 73, contains
large complex repeat CR 73


5e-12


650 


AJ001151 


Homo sapiens 
genomic sequence 


0.068 


<NONE> 


<NONE> 


<NONE> 


651 


X54455 


gene 17 and gene 18


0.067 


<NONE> 


<NONE> 


<NONE> 


652 


X87936 


P.pinea internal 
transcribed spacers 1 
& 2 of ribosomal 


0.067




(U95374) aldehyde 
dehydrogenase [Haloferax 
volcanii] 


4.3 


653 


AF019236 


Dictyostelium
discoideum TipD 

\\l^JL^ J sCnc, lUIllfJIdC 

cds 


0.067 


3882275 


(AB018320) KIAA0777 protein
[Homo sapiens]


1.1 


654 


X90592 


O.cuniculus mRNA 
for p53 protein


0.067


1703275 


METHIONINE 
AMINOPEPTIDASE 2 
(METAP 2) GLYCOPROTEIN) 
(P67) 


0.29 


655 


U41805 


Mus musculus
putative T1/ST2
receptor binding
protein precursor
mRNA, partial cds


0.067 


642518 


(U17326) neuronal nitric oxide
synthase [Homo sapiens]


0.29 


656 


AB007881 


Homo sapiens 
KIAA0421 mRNA, 
partial cds 


0.066


<NONE>


<NONE> 


<NONE> 


657


AL010213


Plasmodium
falciparum DNA ***
SEQUENCING IN
PROGRESS ***
from contig 3-109,
complete sequence


0.066 


<NONE> 


<NONE> 


<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


658 


AB014546


Homo sapiens mRNA
for KIAA0646
protein, complete cds


0.066 


1082461 




| O.J8 | 




659


AF104156


Rattus exulans isolate 
huahine30 
mitochondrial D- 
loop, partial sequence


0.066 


1002380 


(U24189) RRM-type RNA
binding protein [Caenorhabditis
elegans]


0.29


660 


X97581 


M.musculus mRNA 
for spalt transcription 
factor 


0.066 


4107313


(AL035075) putative myosin
heavy chain


0.28


661 


D85378 


Human clone H20 N- 
acetylglucosaminyltra 
nsferase III DNA, 
exon 2 


0.066 


2114473 


(U96963) p140mDia [Mus
musculus]


0.22


662 


M97561 


Human (clone 

LA 179) chromosome 

21 sequence. 


0.065


<NONE>


<NONE>


<NONE>


663 


AE001373 


Plasmodium 
falciparum 
chromosome 2. 
section 10 of 73 of 
the complete 
sequence 


0.065


<NONE>


<NONE>


<NONE>


664 


S75479 


growth hormone 
receptor, growth 
hormone binding 
protein (GHR/BP 
gene} [mice, C57
Black/6, Genomic,
179 nt, segment 8 of
10]


0.065


<NONE>


<NONE>


<NONE> 


665


AF032922


Homo sapiens
syntaxin 4 binding
protein UNC-18c
(UNC-18C) mRNA,
complete cds


0.065


3061308


(AB006074) topoisomerase III
[Mus musculus]


0.82


666 


s 
r 
r 

s 
[ 

SS09S6 3 


vp[40J=svp-related 
uclear 

eceptor/reiinoid 
ignaling modulator 
zebrafishes. mRNA. 
S76 nt] 


0.065 


( 
a 

132628S e 


U58734) weak similarity to 1 
nkyrin G [Caenorhabditis 1 
legans] I 


0.12 ! 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


667


X59552


G.domesticus mRNA
for ventricular myosin
heavy chain


0.065


2497098


HYPOTHETICAL 14.1 KD
PROTEIN IN AMD1-RAD52
INTERGENIC REGION
>gi|1077180|pir||S49745
probable membrane protein
YML034w - yeast
(Saccharomyces cerevisiae)
>gi|575685 (Z46659) unknown
orf, len: 656, CAI: 0.13
[Saccharomyces cerevisiae]


0.014


668


M72980


Anthonomus grandis
vitellogenin gene
(VTG), complete cds.


0.065


3242750


(AC005164) match to ESTs
AA731149 (NID:g2140138),
AA731908 (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1898382),
and AA825820 (NID:g2899132)


5e-33


669


AB014546


Homo sapiens mRNA
for KIAA0646
protein, complete cds


0.064


<NONE>


<NONE>


<NONE>


670


M30039


Sheeppox virus strain
KS-1 ORF HM1
gene, partial cds;
ORF HM2 and ORF
HM3 genes, complete
cds; and ORF HM4
gene, partial cds


0.064


<NONE>


<NONE>


<NONE>


671


Z68013


Caenorhabditis
elegans cosmid
W02H3, complete
sequence
[Caenorhabditis
elegans]


0.064


<NONE>


<NONE>


<NONE>


672


AF041332


Bodo saltans
unknown mRNA,
kinetoplast gene
encoding kinetoplast
protein, complete cds


0.064


<NONE>


<NONE>


<NONE>


673


J00451


Mouse germline IgG-
3 chain gene, D-J-C
region, and switch
region.


0.064


<NONE>


<NONE>


<NONE>






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


674 


U41289 


Dictyostelium
discoideum K7
kinesin-like protein
mRNA, complete cds


0.064


3482972


(AL031369) putative protein 


9.3 


675 


M37395 


L.lactis (strain SKU) 
proteinase plasmid 
pSKUl DNA, 
complete cds. 


0.064 


993019 


(X87246) alternative start codon 
[Pseudorabies virus]


Q 7 


676 


Z15030


H.sapiens gene for 
ventricular myosin 
light chain 2 > :: 
gb|L01652|HUMVM 
LC Human 
ventricular myosin 
light chain 2 gene, 
seven exons. 


0.064 


730343 


PROLACTIN RECEPTOR
PRECURSOR (PRL-R) mouse
>gi|220576|gnl|PID|d1001535
(D10214) prolactin receptor
precursor [Mus musculus]
>gi|293770 (L14811) prolactin
receptor [Mus musculus]
>gi|347542 (L13593) prolactin
receptor [Mus musculus]
receptor:ISOTYPE=long form
[Mus musculus] 


9.1 


677


Z12021


G. max gene for
catalase


0.064


2498711


ORIGIN RECOGNITION
COMPLEX PROTEIN,
SUBUNIT 2 >gi|1185461
(U38472) essential ORC2-
related fission replication factor
Orp2 [Schizosaccharomyces
pombe]


5.3 


678


L05668


Entamoeba histolytica
protein
serine/threonine
kinase (pstk1) gene,
complete cds.


0.064


733140


(U22453) carboxypeptidase
[Simulium vittatum]


5.3 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


679 


U50715


Mus musculus alpha-
galactosidase A gene.


0.064




HYGROMYCIN-B KINASE 
(HYGROMYCIN B 

PHOSPHOTRANSFERASE) 
(APH(7")) 

>gi|66885|pir||WGSMHH 
hygromycin B 

phosphotransferase (EC 2.7.1.-) 
Streptomyces hygroscopicus 
>gi|581682 (X03615) pot. hyg 
protein [Streptomyces 
hygroscopicus] 
phosphotransferase [synthetic 
construct] >gi|2739064 cloning 
vector] >gi|2739068 
(AF025747) hygromycin B
phosphotransferase [unidentified
cloning vector]


2.3 


680 


Z28182


S.cerevisiae
chromosome XI
reading frame ORF


0.064


1079035 


Om(2D) protein - fruit fly 
(Drosophila ananassae) 
>gi|443770|gnl|PID|d 1006095 
(D26553) ORF 


1.8 


681 


M29917 


Human ornithine 
aminotransferase 1 


0.064


^1 i in"* ji 
J.H /yj4 


(U97553) unknown [murine 
herpesvirus 63] 


1.4 


682 


AB020709 


Homo sapiens mRNA 
for KIAA0902 
protein, complete cds 


0.064 


861404 


(U29154) T07F12.3 gene
product [Caenorhabditis
elegans]


0.47 


683


AB014546


Homo sapiens mRNA
for KIAA0646
protein, complete cds


0.064


1708118


HOMEOBOX PROTEIN HB9
>gi|507425


0.35 


684


AB010427


Homo sapiens mRNA
for NORI-1, complete
cds


0.064


2388676


(AF015539) procollagen P
[Mytilus edulis]


0.018 


685


U34774


Orf virus ankyrin-like
repeat protein, F11L
homolog, and F12L
homolog genes,
complete cds.


0.064


731668


SSF1 PROTEIN
>gi|626624|pir||S46700 SSF1
protein - yeast (Saccharomyces
cerevisiae)


1e-05


686


AF022861


Mus musculus
neuropilin-2(a5)
mRNA, alternatively
spliced, complete cds


0.064


4091978


(AF073359) benzaldehyde
dehydrogenase [Pseudomonas
sp. DJ77]


1e-05



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



687 | U14331



DESCRIPTION 



P VALUE 



Sus scrofa myogenin 
gene, complete cds 



0.064 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



(AC004010) similar to Leucine-



P VALUE



2781386 



rich transmembrane proteins; 
44% similarity to U42767 
(PID:g1736918) [Homo
sapiens] 



3e-33 



688 I AF074870 



Chironomus 
pallidivittatus clone 
1219 non-telomeric 
Ssp repeat sequence 



0.063 



689 | Z25523



H.sapiens repeat
region DNA.



<NONE> 



<NONE> 



<NONE> 



690 I AE001378 



Plasmodium 
falciparum 
chromosome 2, 
section 15 of 73 of 
the complete 
sequence ' 



0.063 



<NONE> 



<NONE> 



<NONE> 



0.063 



691 | Z72947



S.cerevisiae
chromosome VII 
reading frame ORF 
YGR162w 



692 



Y14723



Choanomphalus 
incertus 
mitochondrial 
cytochrome c oxidase 
subunit I gene, partial 



0.063 



693 



X74103 



694 | AF039843 



Streptomyces sp. 
gene for alkaline 
serine protease I 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



0.063 



0.063 



Homo sapiens 
Sprouty 2 (SPRY2) 
mRNA, complete cds



0.063 



<NONE> 



<NONE>



1730713 



HYPOTHETICAL 108.5 KD
PROTEIN IN UME3-PUB1
INTERGENIC REGION
>gi|2131866|pir||S62935
hypothetical protein YNL023c -
yeast (Saccharomyces
cerevisiae)

>gi|1301855|gnl|PID|e239870

(Z71299) ORF YNL023c

[Saccharomyces cerevisiae]



<NONE> 



232217 



GLUTATHIONE S-
TRANSFERASE GST-6.0
(GSTB1-1) 
>gi|42I19S|pir||S29772 
glutathione transferase (EC 
2.5.1. IS) - Proteus mirabilis 
>gi|2l26142|pir||S71882 
glutathione transferase (EC 
|2.5.I.I8) B - Proteus mirabilis 
ji| 1053076 (U38482) 



6.7 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


695 


M63650 


Mouse M-twist gene 
mRNA, complete cds


0.063 


1730141 


FRAGILE X MENTAL
RETARDATION SYNDROME
RELATED PROTEIN 2
>gi|2135129|pir||S60173 fragile
X mental retardation syndrome
related protein - human
3ojt 1 008*17 a ji i cm \ -v 
^ginwooj/ \\jjLjYji) rrague / 

mental retardation syndrome 

related protein [Homo sapiens] 


1.8


696 


Y13298


Homo sapiens GDP
dissociation inhibitor
beta pseudogene


0.063 


1085930 


hypothetical protein 4 - fowl 
adenovirus 1 


1.3


697


X56600


Rat SOD-2 gene for 
manganese- 
containing superoxide 
dismutase 


0.063 


3882143 


(AB018254) KIAA0711 protein
[Homo sapiens]


0.60


698 


Z23107 


M.musculus mRNA 
for 5HTx serotonin 
receptor 


0.063 


1708162 


HUNTINGTIN 

(HUNTINGTON'S DISEASE 
PROTEIN HOMOLOG) (HD 
PROTEIN) 


0.45


699 


M20670 


Plasmodium vivax 
circumsporozoite 
protein gene. 3' end. 


0.063 


4033395 


DNA GYRASE SUBUNIT B
subunit [Myxococcus xanthus]


0.35


700 


• 

Z62997 


H.sapiens CpG DNA, 
:lone 76gl 1, reverse 
•ead cpg76gi I.rtla . 


0.063 


1350911 


Ktlh\UIL' AL1U RkLkKlUK 

RXR-BET\ sarjiensl 
>gi|3 172498 (AF065396) 
retinoic X receptor B 
dJI033BIO.ll (Retinoid X 

t^wtj^iui ucui ^iv.uuj^ |_nomo 
sapiens) >gi|4249766 

AF120161) retinoic X receptor 
jeta 


0.16 1 


701 


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.063


2981200


(AF048732) cyclin T2b [Homo
sapiens]


0.090


702 


U95098


Xenopus laevis
mitotic
phosphoprotein 44
mRNA, partial cds


0.063


3877951


(Z81555) predicted using
Genefinder


6e-07


703


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.063


3393018


(AL031174) hypothetical
protein


2e-10



SEQ
ID 



704 



705 



706 



707 



708 



709 



710 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



711 



712 



713 



DESCRIPTION 



D90872 



M25528 



U45256 



U95102



AF044317 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



E.coli genomic DNA 



Kohara clone 
#419(54.7-55.1 min. 



M.crystallinum 
ferredoxin-NADP+ 
reductase (fhr A) 
mRNA, complete cds



Strongyloides ratti 
microsatellite B DNA 



Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds



Z73975 



X54232 



X03073



Y12573 



Homo sapiens 
TEL/AML1 fusion 

gene, partial sequence

Caenorhabditis
elegans cosmid 
T06ES, complete 
sequence 
[Caenorhabditis 
elegans] 



ACCESSION



DESCRIPTION 



0.063 



2498198 



0.062 



<NONE> 



CYTOCHROME B561 
(CYTOCHROME B-561)



. <NONE> 



0.062 



<NONE> 



0.062 



<NONE> 



0.062 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



Human mRNA for 
heparan sulfate 
proteoglycan



0.062 



3108187 



Bovine retinal mRNA 
for transducin beta-
subunit 



0.062 



1076741 



0.062 



D.melanogaster Jun
and 14-3-3 zeta gene 



Bombus terrestris
mitochondrial 
cytochrome oxidase I, 
L26573 partial cds. 



477578 



(AC004663) Notch 3 [Homo 
sapiens] 



P VALUE



chitinase (EC 3.2.1.14) 
precursor - rice precursor - rice 
>gi|S07955 (X87109) chitinase 
[Oryza sativa]



3e-19



<NONE>



<NONE> 



<NONE> 



<NONE> 



2.9 



0.59 



sialidase - Actinomyces viscosus 



0.062 



3879551 



0.062 



1684959 



>gi|141352 



(Z70756) similar to collagen 



(U20600) NADH 
dehydrogenase subunit [Vanda 
lamellata] 



0.087 



0.073 



0.039


SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)


ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


ACCESSION


DESCRIPTION


P VALUE


714 


U58994


Human ladinin (LAD)
gene, complete cds 


0.062 


2811078 


AMINOPEPTIDASE B
(ARGINYL
AMINOPEPTIDASE)
(ARGININE
AMINOPEPTIDASE)
(CYTOSOL
AMINOPEPTIDASE IV) (AP-
B) >gi|2039143 (U61696)
aminopeptidase B [Rattus
norvegicus]


9e-06


715


AB014553


Homo sapiens mRNA
for KIAA0653
protein, partial cds


0.062


1326350


(U58748) similar to potential
transmembrane domains in S.
cerevisiae nuclear division
RFT1 protein (SP:P38206)


5e-10


716


L16898


Mus musculus
collagen alpha 1 type
XVIII mRNA, 5' end.


0.062


1723657 


HYPOTHETICAL 38.3 KD
PROTEIN IN ERV1-GLS2
INTERGENIC REGION
>gi|21325S7|pir||S64322 
probable membrane protein 
YGR031w - yeast
(Saccharomyces cerevisiae)
>gi|1323010|gnl|PID|e243277
(Z72816) ORF YGR031w
[Saccharomyces cerevisiae]


1e-14


717 


X99343 


M. tuberculosis 
guaA/B & choD 
genes 


0.062


3873807


(Z49907) B0491.1
[Caenorhabditis elegans]


2e-19


718


AF010193


Homo sapiens MAD-
related gene SMAD7
(SMAD7) mRNA,
complete cds


0.061


<NONE>


<NONE>


<NONE>


719


L10182


Myrmeleon sp. 18S
ribosomal RNA.


0.061


<NONE>


<NONE>


<NONE>


720


Y14723


Choanomphalus
incertus
mitochondrial
cytochrome c oxidase
subunit I gene, partial


0.061


<NONE>


<NONE>


<NONE>


721


L27840


Bovine respiratory
syncytial virus
nucleoprotein mRNA,
complete cds.


0.061


542955


nucleoporin p62 - human


8.6


SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION



DESCRIPTION



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



722 



U95094 



Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds



723 



U95098 



724 | U26463 



Xenopus laevis
mitotic

phosphoprotein 44
mRNA, partial cds
Sporidiobolus
salmonicolor
NADPH-dependent
aldehyde reductase
gene, complete cds



0.061 



0.061 



0.061 



DESCRIPTION 



Sus scrofa



494454 



>Si|4y443D|pdb|lPO!>|U Sus 

scrofa Sus scrofa 
>gi|1421210|pdb|1PCP| Porcine
Spasmolytic Protein (Psp) (Nmr,
19 Structures) Spasmolytic
Polypeptide

>gi|1633061|pdb|2PSP|B Chain
B. Porcine Pancreatic 
Spasmolytic Polypeptide 



3845272 



P VALUE



1710288 



(AE001417) hypothetical
protein [Plasmodium
falciparum]



2.9 



(U79302) unknown [Homo

sapiens]



1.3 



725 | AF035443



Xenopus laevis wee1
homolog mRNA,
complete cds



0.061 



3979720 



EMBL:D33048 comes from this
gene; cDNA EST 
EMBL:D35780 comes from this 
gene; cDNA EST yk442c6.3 
comes from this gene; cDNA 
EST yk442c6.5 comes from this 
gene; cDNA EST yk398f6.3 
comes from this gene; cDNA 
E... 

>gi|3979816|gnl|PID|e1358315
EST EMBL:D35780 comes 
from this gene; cDNA EST 
yk442c6.3 comes from this 
gene; cDNA EST yk442c6.5 
comes from this gene; cDNA 
EST yk398f6.3 comes from this 
gene; cDNA E 



726 



Z48584 



Caenorhabditis 
elegans cosmid
ZK1321, complete
sequence 
[Caenorhabditis 
elegans] 



0.061 



3183491 



HYfWHETTCAE 41s" iCb — 
PROTEIN C27F2.7 IN 
CHROMOSOME III 

10655 10 (U40419) C27F2.7 
gene product [Caenorhabditis 
elegans]



3e-11



WO 01/02568 



PCT/US00/18374 



SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


727 1 X61489 


Zea mays pep gene 
for (C3 type) 
phosphoenolpyruvate
carboxylase 


0.061 


2496887 


ti V PO i'RETICaL J2.0 KB 
PROTEIN C09F5.2 IN
CHROMOSOME III
>gi|732538 (U22832) C09F5.2
gene product [Caenorhabditis
elegans]


1e-15


728 | AF025408


Drosophila 
melanogaster 
Windbeutel (wind) 
gene, complete cds 


0.061 


3702295 


(AC005783) R33083_1 [Homo
sapiens] 


2e-60 J 


729 J AB012106 


Brassica rapa mRNA 
for SRK45. complete 
cds 


0.060 


<NONE>


<NONE> 


<NONE> 1 


730 I AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 


0.060 


<NONE> 


<NONE> 


<NONE> 1 


731 1 Y08682 


H.sapiens mRNA for
carnitine 

palmitoyltransferase I 
type I 


0.060 


3319446 


(AF077541) contains similarity
to class-I aminoacyl-tRNA
synthetases [Caenorhabditis
elegans]


8.1 1 


1 732 I U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA. 
complete cds 


0.060 


1041 U9 


(D78016) TRAE [Enterococcus 
faecalis]


8.1 1 


733 1 AF064030 


Helianthus tuberosus 
lectin 2 mRNA,
complete cds


0.060 


632209 


regulatory protein Rex - primate 
T-lymphotropic virus PTLV-L 
(fragment)


3.7 1 


734 | AF100694


Mus musculus
Pontin52 mRNA,
complete cds


0.060 


3098348


(AF037401) neuropeptide
Y/peptide YY receptor Yc
[Danio rerio]


2.1 I 


735 | U95102


Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds


0.060 


I 

( 
f 

> 

1< 

P 

12597S > 


-4k PKOTHIN PkkC'UkiiOR 
LEUKOCYTE ANTIGEN 

"-Ln 1 CL/J 

>gi|70146|pir||TDHULK
leukocyte antigen-related
protein precursor - human
;;i|34267 sapiens) 


1.2 J 


736 | U76523


Sambucus nigra lectin
precursor mRNA,
complete cds


0.060 


2055394


(U87306) transmembrane
receptor UNC5H2 [Rattus
norvegicus]


0.32 1 


737 | U69668


Human nuclear pore
complex-associated
protein TPR


0.060 


4127854


'14063) ChT1 thymocyte
antigen [Gallus gallus]


9e-04 | 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE












738


AB014553


Homo sapiens mRNA
for KIAA0653
protein, partial cds


0.060


1326350


(U58748) similar to potential
transmembrane domains in S.
cerevisiae nulcear division
RFT1 protein (SP:P38206)


1 f.C\Q I 


739 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds


0.060 


2632098 


(Y15513) Prodos protein
[Drosophila melanogaster]


5e-10


740 


Z96260 


H.sapiens telomeric 
DNA sequence, clone 
12QTEL 101, read 
12QTELOO10t.seq 


0.059 


<NONE> 


<NONE> 


<NONE> I 


741 


M93128 


Mouse homeobox 
protein (EVX2) 
mRNA, complete cds.


0.059 


<NONE> 


<NONE> 


<NONE>| 


742 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.059 


1652318 


(D90904) lysostaphin 
[Synechocystis sp.]


4.7 1 


743 


AB007920


Homo sapiens mRNA
for KIAA0451
protein, complete cds 


0.059 


479491 


transcription factor brn-3b - 
human 


0.71 J 


744 


M60445 


Human histidine 
decarboxylase (HDC) 
mRNA, complete cds 


0.058 


<NONE> 


<NONE> 


<NONE> 1 


745 


U01836


Ustilago maydis
exodeoxyribonuclease
(REC1) gene,
complete cds.


0.058 


1171903


OLIGOPEPTIDE
TRANSPORT SYSTEM
PERMEASE PROTEIN OPPC
>gi|1075086|pir||D64184
oligopeptide transport system
permease protein (oppC)
homolog - Haemophilus
influenzae (strain Rd KW20)
permease protein (oppC)
[Haemophilus influenzae Rd]


1.5 1 


746 


AF090115


Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds


0.053 


3193265


(AF069131) chitinase [Bacillus
subtilis]


0.002 J 


747 


AB012105


Brassica rapa mRNA
for SLG45, complete
cds


0.057 


( 

4333S5 i 


(U03978) dynein heavy chain
isotype 7A [Tripneustes gratilla]


3.4 1 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


748 


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


0.056 


<NONE> 


<NONE> 


<NONE>


749 


Y16828


Lagopus lagopus
genomic
microsatellite
sequence. LLST4 


0.056 


3328678 


(AE001299) hypothetical
protein [Chlamydia trachomatis]


4.3 1 


750 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.055 


<NONE> 


<NONE> 


<NONE>| 


751 


AF074385 


Sambucus nigra 
hevein-like protein
mRNA, complete cds 


0.055 


137339 


69 KD PROTEIN 
>gi|94375|pir||S19150
hypothetical protein, 69K - 
turnip yellow mosaic virus 


0.69 I 


752 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.035 


<NONE> 


<NONE> 


<NONE>| 


753 


M92069 


Human retrovirus-like
sequence-isoleucine c


0.034 


<NONE> 


<NONE> 


<NONE>| 


754 


S78516 


G1L=ankyrin-like
repeat [orf virus OV,
NZ2, Genomic, 1608
nt]


0.033 


2800465 


(AF043700) contains similarity
to human RNA-binding protein
FUS/TLS (SW:Q2S009)
[Caenorhabditis elegans]


0.15 I 


755 


M15646


Chicken myosin
alkali light chain
mRNA, complete cds,
clone pFI.


0.027


3334221


4-
HYDROXYPHENYLPYRUVATE
DIOXYGENASE 4-
hydroxyphenylpyruvate
dioxygenase [Mycosphaerella
graminicola]


6e-17 


756 


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


0.025 


3877815


(Z96048) predicted using
Genefinder
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


757 


AJ002291


Streptococcus
pneumoniae pbp1b
gene, complete 


0.025 


38804S7 


{^oaui-*) similar to nbose- 
phosphate pyrophosphokinase; 
cDNA EST EMBL:D73173 
comes from this gene; cDNA 
EST EMBL:D70909 comes 
from this gene; cDNA EST 
EMBL:D73449 comes from this 
gene; cDNA EST 
EMBL:D76167 comes from this 
ge... 


1.7 


758 


X79104 


C.botulinum (NCTC 
7272 type A) HA-33 
and P-21 genes. 


0.024 


2648615 


(AE000970) tungsten 
formylmethanofuran 
dehydrogenase, subunit B (fwdB
2) [Archaeoglobus fulgidus]


6.1 


759 


U95102


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA. complete cds 


0.024 


1663698 


(D83785) expressed 
ubiquitously; product similar to 
D.melanogaster mam protein. 
[Homo sapiens]


4.7 


760 I 


U36197 


Chlamydomonas 
reinhardtii cobalamin- 
independent 
methionine synthase 
mRNA. complete cds 


0.024 


585723 


PEROXISOME 
PROLIFERATOR 
ACTIVATED RECEPTOR 
GAMMA (PPAR-GAMMA) 
>gi|2S3SlS|pir||C42214 
peroxisome proliferator- 
activated receptor gamma chain -
African clawed frog >gi|214668
(M84163) peroxisome
proliferator activated receptor
gamma [Xenopus laevis]


0.42 


761 J 


L38865


Macaca mulatta
(clone MMVA63) T-

receptor alpha

(TCR A) mRNA,
partial cds.


0.023 


<NONE> 


<NONE> 


<NONE> 


762 1 


AF035948


Mus musculus insulin
receptor substrate-3


0.023 


2500587


SPLICEOSOME
ASSOCIATED PROTEIN 49
spliceosome-associated protein
SAP-49 - human >gi|556217


0.40 


763 1 


X98890


S.tuberosum mRNA
for inorganic
phosphate
transporter, StPT1


0.023 


P 

1 10072 rr 


roline-rich protein MP4 - 
louse >si|531S2 


0.1S 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


764 | X91212


L.esculentum mRNA
for HD-ZIP protein 


0.022 


[ <NONE> 


<NONE> 




765 | AC004498


Homo sapiens
chromosome 5, P1
clone 1209C1 (LBNL
H104), complete
sequence [Homo 
sapiens] 


0.022 


1 <NONE> 


<NONE> 


- <NONE> 
<NONE> 


766 | U07083


Human prostatic acid
phosphatase (ACPP)
gene, exon 1




1 <NONE> 


<NONE> 


<NONE> 


767 1 X98890 


S.tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPT1


0.022


1 <NONE> 


<NONE> 


<NONE> 


768 j X56488 


L.esculentum LAT59 
gene 5'flanking 
region, expressed 
during pollen 
maturation 


0.022 


<NONE>


<NONE> 


<NONE> 


1 769 1 M34651 


Pseudorabies virus 
with upstream and 
downstream
sequences. 


0.022 1 


<NONE>


<NONE> 


<NONE> 


770 X66727 


P.taeda gene for
protochlorophyllide
reductase 


0.022 1 


3878517 • 


(Z92S06) K10G4.4 
[Caenorhabditis elegans]


4.3 


771 | U95102


Xenopus laevis
mitotic

phosphoprotein 90
mRNA, complete cds


0.022 1 


1854452


(D89501) similar to salivary
proline-rich protein P-B [Homo
sapiens]




772 | U95098


Xenopus laevis
mitotic

phosphoprotein 44
mRNA, partial cds


0.022 


3021699


(AB005298) BAI 2 [Homo
sapiens]


0.64 


773 


X71932


H.sapiens XB gene
for tenascin-X, intron
14


0.022 1 


627059


liver stage antigen LSA-1 -
Plasmodium falciparum
>gi|9916 (X56203) liver stage
antigen


0.05S 


774 


X87369


C.perfringens nanH
gene & ORF1,2,3 & 4


0.022 j 


2062407


(U78975) poly(ADP-ribose)
glycohydrolase [Bos taurus]


0.056 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


775 


Y14971 


Gallus gallus mRNA
for K60 protein


0.022


134091 


U1 SMALL NUCLEAR

RIBONUCLEOPROTEIN 70

KD (U1 SNRNP 70 KD)
>gi|S5864|pir||S02016 U1
snRNP 70K protein - African
clawed frog >gi|65179
(X12430) U1 70K [Xenopus
laevis]


0.032 


776 


AF003133


Caenorhabditis
elegans cosmid
T21E3


0.022


1709997


DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]


2e-08 


777 


AF003133


Caenorhabditis
elegans cosmid
T21E3


0.022


1709997


DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]


2e-08 


778 


U57645 


Human helix-loop-
helix proteins Id-1
(ID-1) and Id-1' (ID- 
1) genes, complete 
cds 


0.021 


<NONE> 


<NONE> 


<NONE> 


779 


U67570 


Methanococcus 
jannaschii section 112 
of 150 of the 
complete genome


0.021 


<NONE> 


<NONE> 


<NONE> 


780 


L01584 


Trypanosoma cruzi
calcium-binding 
protein (CUB2.8) 
gene, complete cds. 


0.021 


<NONE> 


<NONE> 


<NONE> 


781 


L04787 


Borrelia hermsii outer 
membrane lipoprotein 


0.021 


<NONE> 


<NONE> 


<NONE> 


782 


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.021 


<NONE> 


<NONE> 


<NONE> 


783 


L36890


Saccharomyces
cerevisiae
mitochondrion
transfer RNA-Thr1
(tRNA-Thr) gene;
transfer RNA-Val
(tRNA-Val) gene;
oxi2 gene, complete
cds; ORF2 and origin
of replication (orii).


0.021 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



784 | M76741 



785 I M87504 



786 | U94346 



DESCRIPTION 



P VALUE 



ACCESSION 



Homo sapiens biliary 
glycoprotein (BGP) 
gene, partial cds. 



Tetrahymena 
thermophila histone 
H3 (HHT2) gene. 
complete cds 



Human calpain-like 
protease (htra-3) 
mRNA. complete cds 



0.021 



0.021 



0.021 



DESCRIPTION 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



P VALUE 



<NONE> 



<NONE> 



<NONE> 



L01584 



Trypanosoma cruzi 
calcium-binding 
protein (CUB2.8)
gene, complete cds



0.021 



<NONE> 



<NONE> 



<NONE> 



788 I U36S30 



ongo pygmaeus C I' 
microsatellite. clone 
#1, from the tandemly 
repeated genes 
encoding U2 small 
nuclear RNA (RNU2 
locus) 



0.021 



<NONE> 



<NONE> 



<NONE> 



X03833 



Human gene for
interleukin 1 alpha
(IL-1 alpha)



0.021 



416974 



EARLY TRANSCRIPTION 
FACTOR 70 KD SUBUNIT



8.9 



790 | U20806 



Dictyostelium
discoideum guanine 
nucleotide-binding 
protein alpha subunit 
5 (G alpha 5) gene, 
complete cds. 



0.021 



1401211 



(U58510) RNA helicase
homolog [Chlorarachnion
CCMP621]



8.8 



791 | Z59258 



H.sapiens CpG DNA 
clone 13d2. reverse 
read cp°13d2.rtlc . 



792 | AF030692



Plasmodium
falciparum strain 7G8
chloroquine
resistance candidate
protein (cg2) gene,
complete cds



793 | U67570



Methanococcus
jannaschii section 112
of 150 of the
complete genome



0.021 



3121732 



ACONITATE HYDRATASE 
(CITRATE HYDRO-LYASE) 
(ACONITASE) >gi|2183256
(AF002133) aconitase
[Mycobacterium avium] 



0.021 



0.021 



3024190 



NINt PROTEIN" 

gi|2120251|pir||S66581 
hypothetical protein 56 - phage 
S2>gi|105U14(X92588) 
orf56; related to nin60 (ninE) of 
bacteriophage lambda 



7.0 



5.8 



2341037 



(AC000104) F19P19.17
[Arabidopsis thaliana]



4.0 






SEQ 
ID 



794 



795 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION | DESCRIPTION [ p VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



D86566 



Human DNA for 
NOTCH4, partial cds 



LU648 



Streptomyces
coelicolor sigma
factor (rpoX) gene,
complete cds.



ACCESSION 



DESCRIPTION 



0.021 



1708619 



0.021 



79833 



NUCLEAR FACTOR NF-
KAPPA-B P100 SUBUNIT
(H2TF1) (ONCOGENE LYT-
10) (LYT10) [CONTAINS:
NUCLEAR FACTOR NF-
KAPPA-B P52 SUBUNIT]



hypothetical 119.5K protein 
(uvrA region) - Micrococcus 
luteus 



P VALUE



3.1 



796 I U95094 



797 



798 



Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds



0.021 



U30938

Rattus norvegicus
microtubule-
associated protein 2



128000 



D82364 



Chicken mRNA for
TSC-22 variant,
complete cds, clone
SLFEST52



0.021 



468600 



799 | U40041

Gallus gallus eHAND
mRNA, complete cds



NEUROENDOCRINE
CONVERTASE 1
PRECURSOR (NEC 1) (PC1)
(PROHORMONE
CONVERTASE 1) propeptide
processing protease [Mus
cookii]



(X74416) beta-3 integrin
[Takifugu rubripes]



0.021 



693723 



0.021 



3449308 



27 kda amelogenin 
(alternatively spliced)



(AB011541) MEGF8 [Homo
sapiens]



1.0 



1.0 



0.61 



800 



X71932 



H.sapiens XB gene
for tenascin-X, intron
14



801 | AF042333

Oryza sativa 24-
methylene lophenol
C24(1)methyltransferase
mRNA, complete cds



0.021 



627059 



liver stage antigen LSA- 1 - 
Plasmodium falciparum 
>gi|9916 (X56203) liver stage 



0.021 



854065 



802 | L37380 



Rat apical endosomal
glycoprotein mRNA,
complete cds.



803 | AF003133

Caenorhabditis
elegans cosmid
T21E3



(X83413) U88 [Human
herpesvirus 6]



0.021 



3334377 



0.021 



1709997 



TRANSMEMBRANE 
PROTEASE. SERINE 2 



DNA REPAIR PROTEIN 
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]



0.054 



0.014 



1e-05



3e-0S 








SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


804


X57689


Rabbit mRNA for
calcium channel BI-2
(lambda CBP109 and
CB101)


0.021 


2959370 


(AL022117) hypothetical
protein


1e-10


805 


U95102 


Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA. complete cds 


0.021 


1109830 


(U41534) coded for by C.

Similar to helicases of
SNF2/RAD54 family.
[Caenorhabditis elegans]


5e-11


806 


X77753 


H.sapiens TROP-2 
gene 






HYPOTHETICAL 38.5 KD

PROTEIN IN ERV1-GLS2
INTERGENIC REGION
>gi|2132587|pir||S64322
probable membrane protein
YGR031w - yeast
(Saccharomyces cerevisiae)
>gi|1323010|gnl|PID|e243277
(Z72816) ORF YGR031w
[Saccharomyces cerevisiae]


5e-11


807 


X98890 


S. tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPTl 


0.021 


2137872 


zinc finger protein PZF - mouse 
>gi|453376


2e-19 


808 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


809 


AJ224935 


Homo sapiens 
Promotor Region and 
PCK2 gene


0.020 


<NONE> 


<NONE> 


<NONE> 


810 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor
mRNA, complete cds


0.020 


<NONE> 


<NONE> 


<NONE> 


811 


X99941


A.thaliana GBF1
gene


0.020 


<NONE> 


<NONE> 


<NONE> 


812 


X65138


M.musculus mRNA
for tyrosine kinase >
:: gb|S57168|S57168
Sek=Eph-related
receptor protein
tyrosine kinase [mice,
mRNA, 4242 nt]


0.020 


<NONE> 


<NONE> 


<NONE>








SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE
















813 


L04787 


Borrelia hermsii outer
membrane lipoprotein


0.020


<NONE> 


<NONE> 


<NONE> 


814


AJ223633


Enterococcus faecium
genes encoding
enterocin L50A and
enterocin L50B plus
5' and 3' flanking
regions


0.020


<NONE> 


<NONE> 


<NONE> 


815 


AB012106 


Brassica rapa mRNA
for SRK45, complete 
cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


816 


AE001539 


Helicobacter pylori, 
strain J99 section 100 
of 132 of the 
complete genome 


0.020 


172292 


(L11895) transmembrane
protein [Saccharomyces 
cerevisiae] 


8.4 


817 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA. complete cds 


0.020 


94173 


pol polyprotein - Chinese 
hamster intracisternal A-particle 
CH1AP34 


8.0 


818 


M55264 


Herpesvirus saimiri 
dihydrofolate 
reductase (DHFR) 
and snRNA (HSUR) 
genes, complete cds. 


0.020 


2924250 


(Z98745) dJ29K1.2 [Homo 
sapiens] 


6.5 


819 


AF052163 


Homo sapiens clone 
24456 mRNA 
sequence 


0.020 


1706288 


D(4) DOPAMINE RECEPTOR
(D(2C) DOPAMINE 
RECEPTOR) 

>gi|21 194S2|pir||I49246 D4 
dopamine receptor - mouse 
>gi|758427 (U19880) D4 
dopamine receptor [Mus 
musculus] 

>gi|1095539|prf||2109259A
dopamine D4 receptor [Mus 
musculus] 


4.9 


820 


AF074387 


Sambucus nigra 
hevein-like protein
mRNA, complete cds


0.020 


2113798


(Z83259) AmphiBrf38
[Branchiostoma floridae]


4.7 


821 


AF052163


Homo sapiens clone
24456 mRNA
sequence


0.020 


3874733


(Z6/ /D4) cDNA EST
EMBL:T02354 comes from this
gene; cDNA EST
EMBL:D32698 comes from this
gene; cDNA EST
EMBL:D35411 comes from this
gene


4.7 








SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE
















822 


H1002 


Rat ankyrin binding 
glycoprotein-1 related
mRNA sequence.



0.020 


552132 


(K01664) Bkm-like protein 
[Drosophila melanogaster]


3.8 


823 


AE001539 


Helicobacter pylori, 
strain J99 section 100
of 132 of the 
complete genome 


0.020 


172292 


(L11895) transmembrane
protein [Saccharomyces
cerevisiae] 


3.8 


824 


X98890 


S.tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPT1


0.020 


3879798 


" (.i-UI 1_U; illlHIJJ LU u iv. 

Domain (2 domains); cDNA 
EST yk390bl0.3 comes from 
this gene; cDNA EST 
EMBL:D71652 comes from this 
gene; cDNA EST yk275f8.3 
comes from this gene; cDNA 
EST yk393b9.3 comes from this 
gene; cDNA EST yk37... 
>gi|3880220|gnl|PID|e1349842
yk390b10.3 comes from this
gene; cDNA EST
EMBL:D71652 comes from this
gene; cDNA EST yk275f8.3
comes from this gene; cDNA
EST yk393b9.3 comes from this
gene; cDNA EST yk37...


1.3 


825 


U97519 


Homo sapiens 
podocalyxin-like
protein mRNA,
complete cds 


0.020 


1345633 


C-1-TETRAHYDROFOLATE
SYNTHASE, CYTOPLASMIC
(C1-THF SYNTHASE)
(METHYLENETETRAHYDROFOLATE
DEHYDROGENASE /
METHENYLTETRAHYDROFOLATE
CYCLOHYDROLASE
C1-tetrahydrofolate synthase
[Rattus norvegicus]


0.066 


826 


AF003133


Caenorhabditis
elegans cosmid
T21E3


0.020 


1709997


DNA REPAIR PROTEIN
RAD18 >gi|1150622 protein
rad18 [Schizosaccharomyces
pombe]


2e-07 


827 


U32857


Saccharomyces
cerevisiae VAR1
gene, mitochondrial
gene encoding
mitochondrial
protein, 3' processing
site, partial sequence


0.019 


<NONE> 


<NONE> 


<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


828 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 


0.019 


2506381 


NEUROGENIC LOCUS
NOTCH HOMOLOG
PROTEIN 4 PRECURSOR 
(TRANSFORMING PROTEIN 
INT-3) mammary gene mRNA. 
complete cds.], gene product 
[Mus musculus] 


3.3 1 


829 


AF034099 


Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA. complete cds 


0.019 


3880930 


(ALU2 148 1) similar to 
Phosphoglucomutase and 
phosphomannomutase 
phosphoserine; cDNA EST 
EMBL:D36168 comes from this
gene; cDNA EST
EMBL:D70697 comes from this
gene; cDNA EST yk373h9.5
comes from this gene; cDNA
EST EMBL:T008...


6e-15 1 


830 


AF100694 


Mus musculus 
Pontin52 mRNA. 
complete cds 


0.018 


<NONE> 


<NONE> 


<NONE>| 


831 


U24578 


Human RP 1 and 
complement C4B 
precursor (C4B) 
genes, partial cds. 


0.013 


478673 


proline-rich protein precursor - 
kidney bean vulgaris]


3.1 1 


832 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


0.011 


<NONE> 


<NONE> 


<NONE> 1 


833 


U57649


Dibenzofuran-
degrading bacterium
DPO360 2,3-
dihydroxybiphenyl
1,2-dioxygenase
(bphC) gene,
complete cds and 2-
hydroxy-6-oxo-6-

dienoic acid
hydrolase


0.011 


<NONE> 


<NONE> 


<NONE> 1 


834 


X15642


Z. mays gene for
phosphoenolpyruvate
carboxylase


0.011


<NONE> 


<NONE> 


<NONE> 


835 


X51623


C.elegans collagen
gene col-13


0.010 


1695686


(D83706) pyruvate carboxylase
[Bacillus stearothermophilus]


3.1 


836 


U83656


Rattus norvegicus NF-
KB gene, promotor
region


0.003 


4240195


(AB020660) KIAA0853 protein
[Homo sapiens]


10.0






Nearest Neighbor (BlastN vs. Genbank) 



SEQ

ID | ACCESSION



DESCRIPTION 



837 | AJ222657 



Homo sapiens gene 
encoding retina- 
specific guanylyl 
cyclase 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



838 | AB012106



839 I U76524 



840 | AF012899



841 I AF074385 



842 I U48734 



843 | U66669 



844 I D16492 



Brassica rapa mRNA 
for SRK45. complete 
cds 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



DESCRIPTION



POL POLYPROTEIN



P VALUE



0.008 



417704 



0.008 



544024 



(ORF1A/1B) [CONTAINS:' 
RNA-DIRECTED RNA 
POLYMERASE : HELICASE; 



PROTEASE]
CHLORIDE CHANNEL



0.008 



532468 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



Sambucus nigra 
hevein-like protein 
mRNA. complete cds 



0.003 



4101160 



Human non-muscle 
alpha-actinin mRNA. 
complete cds 



Homo sapiens 3-
hydroxyisobutyryl
coenzyme A

hydrolase mRNA,
complete cds 



0.008 



1711520 



PROTEIN, SKELETAL
MUSCLE (CHLORIDE
CHANNEL PROTEIN 1) (CLC
1) human >gi|397143 (Z25587)
human ClC-1 muscle chloride
channel [Homo sapiens]
>gi|398161 (Z25884) human
ClC-1 muscle chloride channel
[Homo sapiens]



(U13643) similar to reverse 
transcriptase; possible 
pseudogene [Caenorhabditis 
elegans] 



(AF002589) cytochrome 
oxidase I [Austrofundulus 
limnaeus]



SRB-S/9 PROTEIN 
>gi|1334996



0.008



2829922 



Mouse mRNA for 
P100 serine protease
of Ra-reactive factor
(RaRF). complete cds 



0.007 



<NONE> 



(AC002291) extensin
[Arabidopsis thaliana]



<NONE> 



0.007 



<NONE> 



<NONE> 



7.4 



4.6 



3.8 



2.7 



1.6 



0.11 



<NONE> 



<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


845


D90923


Human
immunodeficiency
virus type 1 proviral
DNA for envelope
glycoprotein, partial
cds, isolate OSS


0.007 


I <NONE> 


<NONE> 


<NONE>| 


846 


AB011087


Homo sapiens mRNA
for KIAA0515
protein, partial cds 


0.007 


I <NONE> 


<NONE> 


<NONE>| 


847 


AE000688


Aquifex aeolicus 
section 20 of 109 of 
the complete genome 


0.007 


1 <NONE> 


<NONE> 


<NONE> I 


848 


X63723 


B.bovis WC1.1 
mRNA 


0.007 


1 <NONE> 


. <NONE> 


<NONE>| 


849 


AF074386


Sambucus nigra
hevein-like protein 
mRNA. complete cds 


0.007 


<NONE> 


<NONE> 


<NONE>| 


850 


J00097 


Human beta globin
region Alu repetitive
sequence tvpe T*.


0.007


<NONE>


<NONE> 


<NONE>| 


851 I 


D90923 


Human 

immunodeficiency 
virus type 1 proviral 
DNA for envelope 
glycoprotein, partial 
cds. isolate 03S 


0.007 1 


<NONE> 


<NONE> ■ | 


<NONE>| 


852 1 


U95094


Xenopus laevis XL-
INCENP (XL-
INCENP) mRNA,
complete cds


0.007 1 


<NONE> 


<NONE> I 


<NONE>| 


853 1 


X91618


T.castaneum
hunchback gene


0.007 f 


<NONE> 


<NONE> i 


<NONE>| 


854 1 


X03838


Rat nontranscribed
spacer (NTS)
downstream of 28S
rRNA gene


0.007 | 


<NONE> 


<NONE> 


<NONE>


855 1 


M55049


Rattus norvegicus
interleukin-2 receptor
alpha chain (CD25)
mRNA, complete cds.


0.007 1 


<NONE> J_ 


<NONE> 


<NONE>






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



856 | Z64318



857 | AF027173



858 | AF027174 



859 I AF012899 



H.sapiens CpG DNA. 
clone 9e2, reverse 
read cpg9e2.rtla . 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath
A) mRNA, complete 
cds 



0.007 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



0.007 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



0.007 



860 



X95276 



861 



U72396 



falciparum 
complete gene map of 
plastid-like DNA 



Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA. complete cds 



0.007 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



0.007 



0.007 



<NONE> 



<NONE> 



DESCRIPTION 



P VALUE



<NONE> 



<NONE>| 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE>



<NONE> 



<NONE> 



862 | AF100694



Mus musculus 
Pontin52 mRNA. 
complete cds 



0.007 



<NONE> 



863 I AB000383 



Leucania seperata 
nuclear polyhedrosis 
virus DNA for pi 3. 
xe, envelope protein. 
complete cds 



0.007 



<NONE>



864 I D86566 



Human DNA for 
NOTCH4, partial cds



0.007 



<NONE> 



865 



U76524 



Sambucus nigra
ribosome inactivating
protein precursor 
mRNA. complete cds 



0.007 



<NONE> 






<NONE> 



I <NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> I 



<NONE> 



<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE


866 


1 AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA. complete 
cds 


- 

0.007 


3047072 


(AF058825) No definition line 
found [Arabidopsis thaliana] 


8.9 1 


867 


1 AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 


0.007


975754 


(U29359) SpaO [Salmonella 
enterica] 


8.6 1 


868 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.007 


1213557 


(U50199) coded for by C. 
elegans cDNA yk89e9.5, coded 
for by C. elegans cDNA cm7g5; 
coded for by C. elegans cDNA 
cml4b9; coded for by C. 
elegans cDNA yk52g5.5; coded 
for by C. elegans cDNA 
yk76e5.5; coded for by C. 
elegans cDNA ykl31fl 1.5: c... 


8.4 I 


869 1 


AB012106


Brassica rapa mRNA
for SRK45, complete
cds


0.007 


2499568


PROTEIN-L-
ISOASPARTATE(D-
ASPARTATE) O-
METHYLTRANSFERASE
(PROTEIN-BETA-
ASPARTATE
METHYLTRANSFERASE)
(PIMT) (PROTEIN L-
ISOASPARTYL/D-
ASPARTYL
METHYLTRANSFERASE)
methyltransferase [Drosophila
melanogaster] >gi|1171337
melanogaster]


8.3 I 


870 1 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


0.007 


4092077


(AF095353) toll-like receptor 4
mutant [Mus musculus]


6.2 1 


871 J 


AF074386


Sambucus nigra
hevein-like protein
mRNA, complete cds


0.007 


151377


(M80653) tetraheme
[Pseudomonas stutzeri]


6.2 


872 1 


L42319


Bos taurus (clone
a!3.8) tristetraprolin 


0.007 


2507337


TRANSCRIPTION
TERMINATION FACTOR
RHO


5.5 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION
DESCRIPTION
P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION
DESCRIPTION
P VALUE
















873 


M59815 


Human complement

component C4A
gene, exons 10
through 41.


0.007 


3876769 


i ZJ&Jt) 5 i t Nimtlnnrv tc\ Human 

Prolyl 4-hydroxylase alpha 
subunit (SW:P4HA_HUMAN); 
cDNA EST yk219gl2.5 conies 
from this gene; cDNA EST 
yKJ tyaao comes from this 
gene; cDNA EST yk339dl 1.5 
comes from this gene; cDNA 
ESTyk371c9.3... 


5.3 


874 


X63723 


B.bovis WC1.1
mRNA 






(AJ001858) human SHVI2 
[Homo sapiens]


5.3 


875 


AB009864 


Expression vector 
pMEI8S-FL3, 
complete sequence 


0.007


2137618 


p45 NF-E2 related factor 2 - 
mouse musculus]


5.1 


876 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007 


2804497 


(AF043705) contains similarity 
to C2H2-type zinc fingers 


5.0 


877 


U95102 


mitotic 

phosphoprotein 90 
mRNA, complete cds 


0.007 


440298 


(L27469) product of alternative 
splicing [Drosophila 
melanogaster] 


4.7 


878 


X58869 


Chicken mRNA for 

aldehyde 

dehydrogenase 


0.007 


1185062 


(L75945) flagellar export 
protein [Borrelia burgdorferi] 


4.1 


879 


AF027735 


Nephila clavipes 
minor ampullate silk 
protein MiSp 1 
mRNA, partial cds 


0.007 


2394390 


(AF017434) pmi-like gene 
product [Methylobacterium 
extorquens] 


4.0 


880 


AF105228 


Bos taurus tuftelin 
mRNA, complete cds 


0.007 


-3036S02 


(AL022373) putative protein 


3.9 


881 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.007 


( 

1 

< 
t 
I 
1 

y 

25O0S14 a 


HYPOTHETICAL 60.2 KD 
PROTEIN T27F2.1 IN 
CHROMOSOME V 
>gi|38803 1 llgnl|PID|e 1 349855 
BX42 (SW:BX42_DROME); 
cDNA EST EMBL:C07233 
comes from this gene; cDNA 
EST EMBL:C08532 comes 
from this gene; cDNA EST 
yk501h10.3 comes from this 
gene; cDNA EST yk501f 1 .3... 


3.8 






Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


882 


X93567 


L.major mRNA for 
beta-tubulin (1404bp) 


0.007 


2317862 


(U78289) tylactone synthase 
modules 4 & 5 [Streptomyces 
fradiae] 


3.0 


883 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.007 


3881103 


\m-vj40^o) predicted using 
Genefinder, cDNA EST 
EMBL:D76407 comes from this 
gene; cDNA EST 
EMBL:C08999 comes from this 
gene; cDNA EST yk199b12.5 
comes from this gene; cDNA 
EST yk282a4.5 comes from this 
gene; cDNA EST EMBL:C0... 


2.7 


884 


AF041056 


Homo sapiens 
WSCR4 gene, exons 
3 and 4 


0.007 


135817 


THROMBIN RECEPTOR 
PRECURSOR human 
>gi|339677 (M62424) thrombin 
receptor [Homo sapiens] 


2.2 


885 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.007 


1723518 


HYPOTHETICAL 32.2 KD 
PROTEIN C22E12.04 IN 
CHROMOSOME I >gi|1220279 
(Z70043) unknown 


2.1 


886 


M74798 


Hevea brasiliensis 3- 
hydroxy-3- 
methylglutaryl- 
coenzyme A 
reductase gene, 3' 
end. 


0.007 


I0012S2 


(D64003) polyA polymerase 


1.9 


887 


Z62997 


H.sapiens CpG DNA, 
clone 76g11, reverse 
read cpg76g11.rt1a. 


0.007 


1176532 


H V POTHEtYc" aL I'll .9 KD "~t 
PROTEIN C34E10.8 IN 
CHROMOSOME III 
>gi|500731 (U10402) weakly 
similar to protein C kinase 
substrate [Caenorhabditis 


1.8 


888 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.007 


2498317 


DVA-1 POLYPROTEIN 
PRECURSOR nematode 
polyprotein antigen precursor 
[Dictyocaulus viviparus] 
>gi|1585421|prf||2124414A 
polyprotein antigen/allergen 
[Dictyocaulus viviparus] 


1.2 


889 


L29426 


Synechocystis species 
(strain PCC 6803) 
drgA gene, complete 
cds. 


0.007 


3882275 


(AB018320) KIAA0777 protein 
[Homo sapiens] 


1.1 
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Nearest Neighbor (BlastN vs. Gcnbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE 



890 



D83329 



Mus musculus DNA 
for prostaglandin D2 
synthase, complete 
cds 



0.007 



1001741 



(D64004) hypothetical protein 



0.97 



891 



AB012106 



Brassica rapa mRNA 
for SRK45, complete 
cds 



0.007 



1723928 



892 



U76524 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



0.007 



121452 



HYPOTHETICAL 11.6 KD 
PROTEIN IN NUT1-ARO2 
INTERGENIC REGION 
PRECURSOR YGL149w - 
yeast (Saccharomyces 



myc 

GLUTENIN, HIGH 
MOLECULAR WEIGHT 
SUBUNIT 12 PRECURSOR 
>gi|82606|pir||A24266 glutenin 
high molecular weight chain 12 
precursor - wheat >gi|21779 



0.94 



0.79 



893 



AF027173 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 



894 



Rsapiens IMAGE 
YU918 IcDNA clone 268SI 



0.007 



927287 



0.007 



1055188 



(U30294) ORF2 [Prevotella 
ruminicola] 



(U40061) contains similarity to 
transmembrane domains like 
those found in sugar transporter 
proteins 



0.35 



0.26 



895 



896 



897 



898 



899 



L36827 



Mus musculus 
alphaA-crystallin- 
binding protein I 



L36827 



Mus musculus 
alphaA-crystallin- 
binding protein I 



0.007 



4063019 



Z65719 



H.sapiens CpG DNA, 
clone 54c10, reverse 
read cpg54c10.rt1a. 



0.007 



4063019 



0.007 



1097307 



AF064029 



Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 



AF051730 



Mus musculus 
cathepsin S (CatS) 
gene, exon 6 



0.007 



1174915 



(AF083061) ABC transporter 
TliF [Pseudomonas fluorescens] 



(AF083061) ABC transporter 
TliF [Pseudomonas fluorescens] 



HIC-1 gene [Homo sapiens] 



UTROPHIN (DYSTROPHIN- 
RELATED PROTEIN 1) 
(DRP1) (DRP) 

>gi|284488|pir||S28381 utrophin 
protein) [Homo sapiens] 



0.21 



0.20 



0.20 



0.002 



0.007 



1707017 



(U78721) RNA helicase isolog 
[Arabidopsis thaliana] 



0.001 



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i Nearest 


Neighbor (BlastN vs. 


Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Oryctolagus 










900 


U62398 


cuniculus 

gp42/basigin/OX- 
47/HT7 mRNA, 
complete cds. 


0.007 


2370494 


(Z98944) hypothetical protein 


2e-04 


901 


X76341 


M.musculus 
glutathione reductase 
mRNA. 


0.007 


3513303 


(AC005594) R26984_1 [Homo 
sapiens] 


8e-07 


902 


M26215 


Rat (lambda 20B0.5) 
M-type 6- 
phosphofructo-2- 
kinase/fructose-2, 6- 
bisphosphatase 


0.007 


3036809 


(AL022373) putative protein 


6e-15 


903 


AB007902 


Homo sapiens 
KIAA0442 mRNA, 
partial cds 


0.007 


2662165 


(AB007902) mi61l2 cDNA 
clone for KIAA0442 has a 574- 
bp insertion at position 1474 of 
the sequence of KIAA0442. 
[Homo sapiens] 


2e-17 


904 


U93364 


Lactococcus lactis 
cremoris plasmid 
pNZ4000 insertion 

putative transposase 
gene and eps gene 
cluster 

(epsRXABCDEFGHI 
JKL), complete cds 


0.007 


2731377 


(U28739) similar to alcohol 
dehydrogenase/ribitol 
dehydrogenase [Caenorhabditis 
elegans] 


1e-31 


905 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


906 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


907 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


908 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


909 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.006 


<NONE> 


<NONE> 


<NONE> 
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i Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















910 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


911 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.006 




<NONE> 


<NONE> 


912 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


913 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


914 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


0.006 


<NONE> 


<NONE> 


<NONE> 


915 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.006 




<NONE> 


<NONE> 


916 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.006 


4049856 


(AF063866) ORF MSV064 
hypothetical protein 
[Melanoplus sanguinipes 
entomopoxvirus] 


9.6 




917 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.006 


3880536 


(/.SSI) lb) predicted using 
Genefinder; similar to Lectin C- 
type domain short and long 
forms (2 domains); cDNA EST 
EMBL:C10633 comes from this 
gene; cDNA EST 
EMBL:C12424 comes from this 
gene; cDNA EST yk191e7.3 
comes from this ... 


7.9 




918 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.006 


3877761 


(Z81552) F56G4.1 
[Caenorhabditis elegans] 
>gi|3878615|gnl|PID|e1348240 
(Z83118) F56G4.1 


7.5 




919 


X80289 


H.sapiens PTPL1 
mRNA for protein 
tyrosine phosphatase 


0.006 


1168791 


CATHEPSIN E PRECURSOR 

precursor - rabbit >gi|402729 
(L08418) procathepsin E 


7.4 
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i Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












DIACYLGLYCEROL 




920 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


1346371 


KINASE, BETA 

DIACYLGLYCEROL 

KINASE) 

>»il477059lnirllA<i77Ail 
diacylglycerol kinase (EC 
2.7.1.107) beta - rat 90kDa- 


5.5 


921 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 


0.006 


2196567 


(D88588) lipoprotein 
[Escherichia coli] 


4.3 


922 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


2113798 


(Z83259) AmphiBrf38 
[Branchiostoma floridae] 


4.3 


923 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.006 


1388166 


(U58282) Bowel [Drosophila 
melanogaster] 


4.3 


924 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.006 


2496785 


HYPOTHETICAL 20.1 KD 
PROTEIN Y4YS 


4.2 


925 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.006 


416592 


A-AGGLUTININ 
ATTACHMENT SUBUNIT 
PRECURSOR 
>gi|101170|pir||A41258 a- 
agglutinin core protein - 
yeast (Saccharomyces 
cerevisiae) 


2.7 


926 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


0.006 


416592 


A-AGGLUTININ 
ATTACHMENT SUBUNIT 
PRECURSOR 
>gi|101170|pir||A41258 a- 
agglutinin core protein AGA1 - 
yeast (Saccharomyces 
cerevisiae) 


2.5 


927 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.006 


3258584 


(U41263) The 3' UTR of this 
gene overlaps the 3' UTR of 
T19D12.6(confirmed by EST 
hits) [Caenorhabditis elegans] 


2.0 


928 


U33949 


Human Down 
Syndrome region of 
chromosome 21, 
genomic sequence, 
clone A12H1-1.A6. 


0.006 


3850997 


(AF067150) beta-hydroxyacyl- 
ACP dehydratase precursor 


1.9 
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Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1175 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1176 


Y09232 


H.sapiens fertilin 
alpha pseudogene 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1177 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1178 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1179 


AF072847 


Homo sapiens 
putative swelling- 
activated chloride 
channel (CLNS1A) 
gene, intron 6 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1180 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1181 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


<NONE> 


<NONE> 


<NONE> 


1182 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


1213557 


(U50199) coded for by C. 
elegans cDNA yk89e9.5; coded 
for by C. elegans cDNA cm7g5; 
coded for by C. elegans cDNA 
cm14b9; coded for by C. 
elegans cDNA yk52g5.5; coded 
for by C. elegans cDNA 
yk76e5.5; coded for by C. 
elegans cDNA yk131f11.5; c... 


8.4 






Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1183 


AF090115 


Lycopersicon 

esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


2e-04 


729008 


EPITHELIAL DISCOIDIN 
DOMAIN RECEPTOR 1 
PRECURSOR (TYROSINE- 
PROTEIN KINASE CAK) 
(CELL ADHESION KINASE) 
(TYROSINE KINASE DDR) 
(DISCOIDIN RECEPTOR 
TYROSINE KINASE) (TRK E) 
(PROTEIN-TYROSINE 
KINASE RTK 6) sapiens] 


8.3 




1184 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04 


2507582 


HYPOTHETICAL 138.1 KD 
PROTEIN IN MOLR-BGLX 
INTERGENIC REGION 
>gi|1788436 (AE000300) 
putative regulator [Escherichia 
coli] 


7.8 


1185 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


2e-04 


1085500 


collagen alpha 1(IX) chain - 
mouse musculus] 
>gi|744962|prf||2015346A 
collagen:SUBUNIT=alpha1:ISO 
TYPE=IX [Mus musculus] 


7.8 


1186 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


2623967 


(Y13942) GTN Reductase 
[Agrobacterium radiobacter] 


7.4 


1187 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


2e-04 


2497316 


ADVANCED 
GLYCOSYLATION END 
PRODUCT-SPECIFIC 
RECEPTOR PRECURSOR 
(RECEPTOR FOR 
ADVANCED 
GLYCOSYLATION END 
PRODUCTS) products receptor 
precursor - bovine >gi|163651 
(M91212) receptor for advanced 
glycosylation end products [Bos 
taurus] 


5.3 


1188 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


2e-04 


1001710 


(D64004) hypothetical protein 


3.5 



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Nearest Neiahbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Arabidopsis thaliana 






(U41263) The 3' UTR of this 




1189 


AJ005813 


mRNA for 
neoxanthin cleavage 
enzyme 


2e-04 


3258584 


gene overlaps the 3' UTR of 
T19D12.6(confirmed by EST 
hits) [Caenorhabditis elegans] 


2.1 


1190 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


2e-04 


2736338 


(AF038623) contains similarity 
to RNA recognition motifs 


0.89 


1191 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 


2e-04 


2196567 


(D88588) lipoprotein 
[Escherichia coli] 


0.69 


1192 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


2e-04 


3319874 


(AJ006096) F-spondin 
[Branchiostoma floridae] 


5e-04 


1193 


L26049 


Chlamydomonas 
reinhardtii dynein 
heavy chain alpha 
(ODA11) gene, exons 
2-15, and partial cds. 


2e-04 


3876775 


(Z81077) predicted using 
Genefinder: Similarity to Yeast 
protein 8248 (TR:G587531) 


2e-09 


1194 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-04 


<NONE> 


<NONE> 


<NONE> 


1195 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


1e-04 


<NONE> 


<NONE> 


<NONE> 


1196 


L34219 


Homo sapiens 
retinaldehyde-binding 
protein (CRALBP) 
gene, complete cds. 


1e-04 


<NONE> 


<NONE> 


<NONE> 


1197 


X51890 


Rhesus monkey 
interleukin-3 gene 


1e-04 


<NONE> 


<NONE> 


<NONE> 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Plasmodium 










1198 


AE001421 


falciparum 
chromosome 2. 
section 58 of 73 of 
the complete 
sequence 


1e-04 


<NONE> 


<NONE> 


<NONE> 


1199 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


1e-04 


<NONE> 


<NONE> 


<NONE> 


1200 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


1e-04 


2576287 


(Y15086) HepC protein 
[Cylindrotheca fusiformis] 


4.7 


1201 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


1e-04 


3j95673 


(AB016623) RWC-3 [Oryza 
sativa] 


0.14 


1202 


AFO3S035 


Homo sapiens 
BRCA1-associated 
RING domain protein 
(BARD1) gene, 
exons 2 and 3 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1203 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1204 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1205 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1206 


AF034099 


Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA, complete cds 


9e-05 


1351553 


HYPOTHETICAL 
LIPOPROTEIN MG348 
PRECURSOR 
>gi|l36166S|pir||E64238 
hypothetical protein MG348 - 
Mycoplasma genitalium (SGC3) 
>gi|3844931 


8.8 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1207 


D50006 


Human DNA for 
alpha-platelet-derived 
growth factor 
receptor, exon 6-10 


9e-05 


3063639 


(AF056494) NADH 
dehydrogenase subunit 5 
[Panorpa japonica] 


5.1 


1208 


U50423 


Human Down 
Syndrome region of 
chromosome 21, 
Clone A41B8-1B7. 


9e-05 


124273 


INHIBIN ALPHA CHAIN 
PRECURSOR bovine 
>gi|163195 (M13273) inhibin A 
subunit [Bos taurus] 


3.0 


1209 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


9e-05 




dihydroxybenzoate 
monooxygenase [Sphingomonas 
spi 


"> i 

—.-j 


1210 


AC005276 


Homo sapiens clone 
fragment 

uwot-igapj rrom 
7q31.3, complete 
sequence [Homo 
sapiens] 


9e-05 


1492075 


(U60315) MC132L [Molluscum 
contagiosum virus subtype 1] 


1.0 


1211 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-05 


2887423 


(AB007884) KIAA0424 [Homo 
sapiens] 


2e-10 


1212 


X77772 


C.fuscus gamma-M2- 
1 crystallin mRNA. 


9e-05 


2072425 


crystallin like protein [Homo 
sapiens] 


7e-25 


1213 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


OC-Uj 




<NONE> 


<NONE> 


1214 


L06178 


Apis mellifera 
ligustica complete 
mitochondrial 
genome 


ae-uj 






<NONE> 


1215 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 

cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1216 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1217 


L06178 


Apis mellifera 
ligustica complete 
mitochondrial 
genome 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1218 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 
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Nearest Neishbor (BlasiN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1219 


AF100694 


Pontin52 mRNA, 
complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1220 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1221 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


1722841 


WNT-11 PROTEIN 
PRECURSOR (XWNT-11) 
clawed frog >gi|439108 
(L23542) maternal protein 


9.9 


1222 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


1205991 


(U35637) nebulin [Homo 
sapiens] 


9.6 


1223 


AF024605 


Homo sapiens serine 
protease-like protease 
Sequence 2 from 
patent US 5736377 


8e-05 


3242783 


(AF055354) respiratory burst 
oxidase protein B 


8.6 


1224 


Y13148 


Rattus norvegicus 
mRNA for PAG608 
gene 


8e-05 


2314243 


(AE000616) alpha-ketoglutarate 
permease (kstP) 


8.1 


1225 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


8e-05 


1170586 


RAS GTPASE-ACTIVATING- 
LIKE PROTEIN IQGAP1 
(P195) (KIAA0051) 
>gi|627594|pir||A54854 Ras 
GTPase activating-related 
protein - human sapiens] 
>gi|536844 (L33075) ras 
GTPase-activating-like protein 
[Homo sapiens] 


7.8 


1226 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


464239 


NADH-UBIQUINONE 
OXIDOREDUCTASE CHAIN 
4>gi|10S5lS5|pir[|S5:96S 
NADH dehydrogenase chain 4 - 
honeybee mitochondrion 
(SGC4) >gi|552446 (L06178) 
NADH dehydrogenase subunit 4 
[Apis mellifera ligustica] 


3.5 


1227 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-05 


544353 


F-SPONDIN PRECURSOR 


3.5 



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Nearest Neighbor iBIastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1228 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


8e-05 


483243 


apolipoprotein B-100 - chicken 
(fragment) 


3.4 


1229 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


8e-05 


91207 


proline-rich protein - mouse 
(fragment) musculus] 


2.2 


1230 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


2499181 


ZONADHESIN PRECURSOR 
>gi|1066466 


2.2 


1231 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


8e-05 


2499181 


ZONADHESIN PRECURSOR 
>gi|1066466 


1.9 


1232 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


8e-05 


2833647 


(AF027972) flagelliform silk 
protein [Nephila clavipes] 


1.6 


1233 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


8e-05 


1163063 


(Z49821) MYO2 
[Saccharomyces cerevisiae] 


0.90 


1234 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


16534SS 


(D90914) hypothetical protein 


0.30 


1235 


M26510 


Chicken nonmuscle 
myosin heavy chain 
(MHC) gene, 
complete cds. 


8e-05 


112159 


plectin - rat 


0.003 


1236 


U56402 


Human chromatin 
structural protein 
homolog 


8e-05 


2088823 


(AF003384) weak similarity to 
the peptidase family A2 


1e-13 


1237 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-05 


437181 


(U02289) GTPase-activating 
protein [Caenorhabditis elegans] 


2e-17 


1238 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-05 


465983 


HYPOTHETICAL 80.8 KD 
PROTEIN ZC21.4 IN 
CHROMOSOME III 


8e-27 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1239 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


7e-05 


<NONE> 


<NONE> 


<NONE> 


1240 


U83656 


Rattus norvegicus NF- 
KB gene, promoter 
region 


7e-05 


3880858 


(AL031633) predicted using 
Genefinder; cDNA EST 
yk304f12.5 comes from this 
gene [Caenorhabditis elegans] 


9.3 


1241 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


7e-05 


3080538 


(AL022600) hypothetical 
protein 


9.2 


1242 


X39398 


H.sapiens ung gene 
for uracil DNA- 
glycosylase 


7e-05 


549700 


HYWJTRbl'lLAL li.t KD " 
PROTEIN IN MDH1-VMA5 
INTERGENIC REGION 
>gi|539lS2|pir||S3790S 
hypothetical protein YKL083w - 
yeast (Saccharomyces 
cerevisiae) >gi|486120 
(Z280S2) ORF YKL083w 


1.8 


1243 


M83753 


Bovine follicle 
Stimulating hormone- 
beta subunit gene, 
complete cds. 


7e-05 


2398621 


(AJ000342) DMBT1 protein, 
5.8 kb transcript [Homo sapiens] 


1.8 


1244 


M80829 


Rat troponin T 
cardiac isoform gene, 
complete cds 


5e-05 


854065 


(X83413) U88 [Human 
herpesvirus 6] 


2e-08 


1245 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


4e-05 


120240 


FLAGELLIN B2 PRECURSOR 
Methanococcus voltae 
>gi|150063 (M72148) flagellin 


5.2 


1246 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1247 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1248 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 
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Nearest Neighbor f BlasiN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 

ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Rattus norvegicus 










1249 


AF093268 


homer-1c mRNA, 
complete cds 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1250 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


3e-05 


2773226 


(AF039716) Similar to protein 
kinase [Caenorhabditis elegans] 


6.7 


1251 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


2072961 


(U93568) putative pl50 [Homo 
sapiens] 


5.6 


1252 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 


3e-05 


121855 


EXOGLUCANASE II 
PRECURSOR cellulose 1,4-beta- 
cellobiosidase (EC 3.2.1.91) II 
precursor - fungus (Trichoderma 
reesei) 1,4-beta-cellobiosidase 
(EC 3.2.1.91) II - fungus 
cellobiohydrolase II 
[Trichoderma reesei] 


4.6 


1253 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


3880516 


(AL021572) similar to CTP 
SYNTHASE (EC 6.3.4.2) (UTP- 
- AMMONIA LIGASE) (CTP 
SYNTHETASE) 


3.3 


1254 


M88299 


Mouse brain-1 POU- 
domain protein, 
complete cds. 


3e-05 


1947043 


(U66102) intimin [Escherichia 
coli] 


3.0 


1255 


U95098 


Xenopus laevis 
mitotic 

phosphoprotein 44 
mRNA, partial cds 


3e-05 


3122872 


CELL-CYCLE NUCLEAR 
AUTOANTIGEN SG2NA 
(S/G2 NUCLEAR ANTIGEN) 
>gi|1082650|pir||JC2522 nuclear 
autoantigen - human >gi|805095 
(U17989) GS2NA 


2.8 


1256 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


1352145 


CYTOCHROME C OXIDASE 

POLYPEPTIDE I chain I - 
Thermus aquaticus >gi|155083 
(M84341) cytochrome c oxidase 
subunits precursor [Thermus 
thermophilus] 


2.6 


1257 


U72396 


Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP17.6 
mRNA, complete cds 


3e-05 


2811015 


SEGMENTATION POLARITY 
PROTEIN ENGRAILED 
>gi|2076747 (U42429) 
engrailed [Anopheles gambiae] 
>gi|2148918 (U42214) 
engrailed [Anopheles gambiae] 


2.0 



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Nearest Neighbor (BlastN vs. Genbank) j 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1258 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


3e-05 


1657752 


(U62325) FE65-like protein 
[Homo sapiens] 


1.7 


1259 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-05 


2072961 


(U93568) putative p150 [Homo 
sapiens] 


1.5 


1260 


U76523 


Sambucus nigra lectin 
precursor mRNA, 
complete cds 


3e-05 


1352145 


Cytochrome c oxidase 

POLYPEPTIDE I chain I - 
Thermus aquaticus >gi|155083 
(M84341) cytochrome c oxidase 
subunits precursor [Thermus 
thermophilus] 


1.1 


1261 


X91890 


H.sapiens regulatory 
region of HOXA7 
gene 


3e-05 


111013 


Sxr (Bkm-homolog) sex- 
determining region protein - 
mouse 


1.0 


1262 


L36936 


Homo sapiens rnctasc 
gene, partial cds. 


3e-05 


1944352 


(D84239) IgG Fc binding 
protein [Homo sapiens] 


0.99 


1263 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


3e-05 


4 L 7782 


SMP2 PROTEIN 
>gi|320853|pir||S30911 SMP2 
protein - yeast (Saccharomyces 
cerevisiae) gene 
[Saccharomyces cerevisiae] 


0.S9 


1264 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-05 


1708501 


INTEGRIN ALPHA CHAIN- 
LIKE PROTEIN alpha tntlp 
[Candida albicansl 


0.39 


1265 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


3e-05 


1587031 


cis-Golgi matrix protein GM130 
[Rattus norvegicus] 


0.20 


1266 


Z31014 


Human DNA 
sequence from 
cosmid U65A4, 
between markers 
DXS366 and DXS87 
on chromosome X 


3e-05 


2072964 


(U93569) putative pl50 [Homo 
sapiens] 


0.049 
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Nearest Neiehbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












glycosylated and myristilated 




1267 


Z96668 


H.sapiens telomeric 
DNA sequence, clone 
7PTEL001, read 
7PTEL00001.seq 


3e-05 


542429 


smaller surface antigen - 
Plasmodium falciparum 
>gi|836640 (X76298) 
glycosylated and myristilated 
smaller surface antigen gallus] 
>gi|1092178|prf||2023165B 
surface antigen 


0.029 


1268 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


3e-05 


3879121 


(Z70310) predicted using 
Genefinder; Similarity to Mouse 
ankyrin (PIR Acc. No. S37771); 
cDNA EST EMBL:T01923 
comes from this gene; cDNA 
EST EMBL:D32335 comes 
from this gene; cDNA EST 
EMBL:D32723 comes from this 
gene; cDNA ES... Genefinder, 
Similarity to Mouse ankyrin 
(PIR Acc. No. S37771); cDNA 
EST EMBL:T01923 comes 
from this gene; cDNA EST 
EMBL:D32335 comes from this 
gene; cDNA EST 
EMBL:D32723 comes from this 
gene; cDNA ES... 


2e-13 


1269 


AF074385 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


3e-05 


2497677 


ZYXIN (ZYXIN 2) sapiens] 
>gi|1545954|gnl|PID|e223417 
(X95735) zvxin 


2e-23 


1270 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


1e-05 


<NONE> 


<NONE> 


<NONE> 


1271 


X16318 


Canine mRNA for
signal recognition
particle 54k protein


1e-05


3122612


PITUITARY HOMEOBOX 3
(HOMEOBOX PROTEIN
PITX3) >gi|2645427
(AF005772) homeobox protein
Pitx3 [Mus musculus]


4.4 


1272 


AB012105


Brassica rapa mRNA
for SLG45, complete
cds


1e-05


1652458


(D90905) DNA mismatch repair
protein MutL [Synechocystis
sp.]


0.62 


1273 


U57843 


Human
phosphatidylinositol
3-kinase delta
catalytic subunit
mRNA, complete cds


1e-05


475909 


(X67098) ORF1A [Homo 
sapiens]


0.22 


1274 


Z96569 


H.sapiens telomeric 
DNA sequence, clone 
2QTEL054, read 
2QTELOO054.seq 


1e-05


2137043 


unknown protein - rabbit 
(fragment) cuniculus] 


0.005


1275 


AE000810


Methanobacterium
thermoautotrophicum
from bases 172512 to
182957 (section 16 of
148) of the complete
genome


1e-05


3877579 


^t,ui, i i j cmTTTTTirTTj co .»njaac 

kinensin-like protein KIF4 
(SW:P33174); cDNA EST 
EMBL:D27320 comes from this 
gene; cDNA EST 
EMBL:D27322 comes from this 
gene; cDNA EST 
EMBL:D27321 comes from this
gene; cDNA EST
EMBL:D35764 comes... Mouse
kinensin-like protein KIF4
(SW:P33174); cDNA EST
EMBL:D27320 comes from this
gene; cDNA EST
EMBL:D27322 comes from this
gene; cDNA EST
EMBL:D27321 comes from this
gene; cDNA EST 
EMBL:D35764 comes... 


6e-27 


1276 


AB012113


Homo sapiens gene 
for CC chemokine 
PARC precursor, 
complete cds 


9e-06 


<NONE> 


<NONE>


<NONE> 


1277 


AC005830 


Homo sapiens Xpi'i- 
154-155 BAC GSHB-
52411 (Genome
Systems Human BAC
Library), complete
sequence [Homo
sapiens]


9e-06 


<NONE> 


<NONE> 


<NONE> 


1278 


DS6245 


Human MHC (HLA) 
DRB intron 1 DNA, 
partial sequence 


9e-06 


1051253 


(U37531) mucin apoprotein 
[Mus musculus] 


1.3 


1279 


D79998 


Human mRNA for 
KIAA0176 gene, 
partial cds 


9e-06 


2833253 


HYPOTHETICAL PROTEIN 
KIAA0176 sapiens] 


4e-06 



1280


U10246


Toxoplasma gondii
RH uracil
phosphoribosyl
transferase gene,
complete cds.


9e-06


3876090


(/.oyojjj similarity to yeast
uridine kinase
(SW:URK1_YEAST); cDNA
EST EMBL:Z14695 comes
from this gene; cDNA EST
CEMSE17F comes from this
gene; cDNA EST
EMBL:D67355 comes from this
gene; cDNA EST yk209h1.5
comes from this ge...


7e-33 


1281 


U10246


Toxoplasma gondii 
RH uracil 
phosphoribosyl 
transferase gene, 
complete cds. 


9e-06 


3876090 


(/.OVbjJ,) similarity to yeast
uridine kinase
(SW:URK1_YEAST); cDNA
EST EMBL:Z14695 comes
from this gene; cDNA EST
CEMSE17F comes from this
gene; cDNA EST
EMBL:D67355 comes from this
gene; cDNA EST yk209h1.5
comes from this ge...


7e-34 


1282 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


8e-06 


<NONE> 


<NONE> 


<NONE> 


1283 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds


8e-06 


<NONE> 


<NONE> 


<NONE> 


1284 


U66340 


Human Rh blood 
group C antigen 
(RHCE) gene, exon 
2, partial cds


8e-06 


1707155 


(U80837) F07E5.6 gene product 
[Caenorhabditis elegans]


9.6 


1285 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


7e-06 


<NONE> 


<NONE> 


<NONE> 


1286 


M29930 


Human insulin 
receptor (allele 2) 
gene, exons 14, 15,
16 and 17.


4e-06 


<NONE> 


<NONE> 


<NONE> 


1287 


L42103 


Homo sapiens 
(subclone 5_d3 from 
P1 H25) DNA
sequence. 


3e-06 


<NONE> 


<NONE>


<NONE> 


1288


AF012244


Mus musculus
cerberus-like (Cer-1)
gene, complete cds


3e-06


<NONE> 


<NONE> 


<NONE> 


1289 


Z69366 


Human DNA 
sequence from 
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST.


3e-06 


<NONE> 


<NONE> 


<NONE> 


1290 


Z69366 


Human DNA 
sequence from 
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST.


3e-06 


<NONE> 


<NONE> 


<NONE>


1291 


X85232 


H.sapiens 
chromosome 3 
sequences 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1292 


M32674 


Human platelet 
glycoprotein IIIa,
exons 7, 8 and 9.


3e-06 


<NONE> 


<NONE> 


<NONE> 


1293 


D16879


Human HepG2 3'
region cDNA, clone
hmd2a0t 


3e-06 


998296 


(U33484) ependymin 
[Hemiodus sp.] 


5.6 


1294 


U18614


Lagothrix lagotricha
interphotoreceptor
retinoid-binding
protein (IRBP) gene,
intron 1, complete
sequence 


3e-06 


1613846 


(U71440) polyprotein [Rice
tungro spherical virus]


5.0 


1295 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds


3e-06


1477646


(U53204) plectin [Homo
sapiens] >gi|1477651 (U63610)
plectin [Homo sapiens]


4.0 


1296 


AF016898


Homo sapiens B-ATF
gene, complete cds


3e-06


1085177


reverse transcriptase - fruit fly
reverse transcriptase
[Drosophila yakuba]


3.0 


1297 


AB018490


Homo sapiens DNA,
trinucleotide repeats
region


3e-06


3876572


(Z81522) predicted using
Genefinder; similar to RNA
recognition motif. (aka RRM,
RBD, or RNP domain)
[Caenorhabditis elegans]


3.0 



WO 01/02568 



PCT/US00/18374 







1298 


AF027174


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
B) mRNA, complete
cds


3e-06 


4240137 


(AB020631) KIAA0824 protein 
[Homo sapiens]


2.7 


1299 


M37929 


Homo sapiens 
adenosine 
monophosphate
deaminase 1
(AMPD1) gene,
exons 11-12.


3e-06


1653775


(D90916) thiol:disulfide
interchange protein DsbD
[Synechocystis sp.]


1.7 


1300 


M37929 


Homo sapiens 
adenosine 
monophosphate 
deaminase 1 
(AMPD1) gene, 
exons 11-12.


3e-06 


1653775 


(D90916) thiol:disulfide 
interchange protein DsbD 
[Synechocystis sp.]


1.7 


1301 


U60496 


Glycine max actin
(SoyS6) gene, partial 
cds 




1730738 


ACTIN-LIKE PROTEIN ARP5
Ynl2430p [Saccharomyces 
cerevisiae] 


2e-05 


1302 


Al4jOJ 


Yersinia
pseudotuberculosis
rplC, rplD, rplW,
rplB and rpsS genes 
for ribosomal proteins 
L3, L4, L23, L2 and 


3e-06 


585879 


50S RIBOSOMAL PROTEIN 

L2 maritima >gi|437926 

{TP 16771 ribosomal protein L2 


2e-12 


1303 


Z34969 


H.sapiens DNA for
microsatellite
polymorphism


2e-06 


<NONE> 


<NONE> 


<NONE> 


1304 


X64707 


H.sapiens BBC1 
mRNA 


1e-06


<NONE> 


<NONE> 


<NONE> 


1305 


AC005830


Homo sapiens Xpii-
154-155 BAC GSHB-
52411 (Genome
Systems Human BAC
Library), complete
sequence [Homo
sapiens]


1e-06


<NONE> 


<NONE> 


<NONE> 


1306 


J04058 


Human electron 
transfer flavoprotein 
alpha-subunit mRNA,
complete cds.


1e-06


<NONE> 


<NONE> 


<NONE> 





1307 


L25647 


Homo sapiens 
fibroblast growth 
factor receptor gene 
(located in the central 
MHC) signal peptide 
and consecutive exon 


1e-06


1586734


mxcQ gene [Methylobacterium
organophilum]


5.4 


1308


L26261


Human MHC class III
HLA-RP1 gene.


1e-06


1684985 


(U20633) NADH 
dehydrogenase subunit 
[Neuwiedia veratrifolia] 


1.8 


1309 


AF002233 


Mus musculus alpha- 
actinin-2 associated 
LIM protein mRNA,
alternatively spliced
product, complete cds


1e-06


2996196 


(AF053367) carboxyl terminal 
LIM domain protein [Mus 
musculus] 


4e-17 


1310 


M10935


Human haptoglobin 
gene (alpha-2 allele), 
complete cds and 
haptoglobin-related 
gene, exon 1 and 
three Alu repeats. 


6e-07 


<NONE> 


<NONE> 


<NONE> 


1311


AC002251


Homo sapiens
(subclone 1_g6 from
BAC H76) DNA 
sequence 


4e-07 


2144491 


coagulation factor Xa (EC
3.4.21.6) precursor norvegicus]


4.2 


1312 


AF047717 


Streptomyces 
chrysomallus 
actinomycin 
synthetase II (acmB)
gene, complete cds 


4e-07 


699196 


(UI51S1) 4-coumarate-CoA
ligase [Mycobacterium leprae]


1e-06


1313 


U14417


Human Ral guanine
nucleotide
dissociation
stimulator mRNA,
partial cds.


4e-07 


544402 


GUANINE NUCLEOTIDE
DISSOCIATION
STIMULATOR RALGDS
FORM A (RALGEF)
>gi|321257|pir||S28415 guanine
nucleotide dissociation
stimulator ralGDS - mouse
>gi|193573 (L07924) guanine
nucleotide dissociation 
stimulator [Mus musculus] 


Se-OS 


1314 


Z79027 


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20GS


3e-07


<NONE> 


<NONE> 


<NONE> 


1315


U67167


Homo sapiens
intestinal mucin
(MUC2) gene,
promoter region and
partial cds


3e-07 


<NONE> 


<NONE> 


<NONE> 


1316 


ArUso J5o 


Homo sapiens full 
length insert cDNA 
clone t,W*\.\- 1 1 


cn 

Jc-u / 


yjl^i 






1317 


U67228 


Human clone HS4.01 
Alu-Ya5 sequence


3e-07 


1938437 


(U97003) contains similarity to
C4-type zinc fingers and a 
ligand-binding domain of 
nuclear hormone receptors 


2.3 


1318 


U94346 


Human calpain-like 
protease (htra-3) 
mRNA, complete cds


3e-07 


2911858 


(AF047659) No definition line 
found [Caenorhabditis elegans]


0.39 


1319 


Y15724


Homo sapiens
SERCA3 gene, exons
1-7 (and joined CDS)


1e-07


<NONE> 


<NONE> 


<NONE>


1320


X13596


Bean DNA for
glycine-rich cell wall
protein GRP 1 3


1e-07


<NONE> 


<NONE> 


<NONE> 


1321 


MS j094 


Homo sapiens 
cytosolic selenium- 
dependent glutathione 
peroxidase gene, 
complete cds, and 
rhoh12 gene, 3' end.


1 a A "7 

le-u/ ( 


1 jzojoj 


(U58751) C07G1.7 gene
product [Caenorhabditis
elegans]


R O 


1322 


Z55905 


H.sapiens CpG DNA, 
clone 71f4, forward
read cpg71f4.ft1a.


1e-07


1076802


extensin-like protein - maize
>gi|600118 mays]


0.61 


1323 


X03541 


Human mRNA of trk
gb|I96186|I96186
Sequence 23 from
patent US 5734039


1e-07


325465 


(M74509) [Human endogenous 
retrovirus type C oncovirus 
sequence.], gene product [Homo 
sapiens] 


3e-04 


1324 


AF027766 


Canis familiaris Y- 
linked zinc finger 
protein 


1e-07


220643


(D10628) zinc finger protein
[Mus musculus]


7e-08


1325 


D13613 


Bovine mRNA for
rabphilin-3A,
complete cds > ::
dbj|E07S09|E07S09
cDNA encoding
rabphilin-3A


1e-07


2822161


(AC0040S2) rab3 effector-like:
35% Similarity to AF007S36
(PID:g2317778) [Homo
sapiens]


6e-11


1326


X57110


Human mRNA for c-
cbl proto-oncogene


1e-07


323270


(J04169) gag-onc fusion protein
[Cas NS-1 retrovirus]


3e-14


1327 


X57110


Human mRNA for c-
cbl proto-oncogene


1e-07


115855


PROTO-ONCOGENE C-CBL
human >gi|29731 (X57110) c-
cbl protein [Homo sapiens] 


4e-19 


1328 


AC001178 


Homo sapiens 
(subclone 2_g12 from
BAC H94) DNA 
sequence 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1329 


U11866 


Human interleukin-8 
receptor type B 
(IL8RB) gene,
promoter and exons 1- 
6 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1330 


AC001225 


Homo sapiens 
(subclone 2_e6 from 
BAC H94) DNA 
sequence 


4e-08 


478184 


histone H1 II-1 (clone L95) -
midge


6.5 


1331 


M73837 


Human modulator 
recognition factor 2 
(MRF-2) mRNA,
complete cds. 


4e-08 


141448 


HYPOTHETICAL 32.6 KD
PROTEIN IN TRANSPOSON
TN4556 >gi|80758|pir||JQ0428
hypothetical 32.6K protein -
Streptomyces fradiae transposon 
Tn4556 


4.7 


1332 


AC006164 


Homo sapiens clone 
UWGC:y28gap from 
6p21, complete
sequence [Homo
sapiens]


4e-08 


2580578 


(AF000996) ubiquitous TPR 
motif, Y isoform [Homo
sapiens] 


1.2 


1333 


X01060 


Human mRNA for 
transferrin receptor 


4e-08 


135514 


T-CELL RECEPTOR BETA 
CHAIN PRECURSOR 
precursor (ANA ID- rabbit 


0.61 


1334 


Y10697


H.sapiens INE2
mRNA


4e-08


124909


INSULIN RECEPTOR-
RELATED PROTEIN
PRECURSOR (IRR) (IR-
RELATED RECEPTOR)
>gi|186555 sapiens]


0.14 


1335 


U60416 


Rattus norvegicus 
myr 6 myosin heavy 
chain mRNA,
complete cds


4e-08


102189


myosin I, high molecular weight
- Acanthamoeba sp 


3e-0S 


1336


U23804


Drosophila
melanogaster putative
GTP-binding
regulatory protein
beta chain (GPB)
mRNA, partial cds.


4e-08


2494916


HYPOTHETICAL 35.1 KD
TRP-ASP REPEATS
CONTAINING PROTEIN
T10F2.4 IN CHROMOSOME
III protein; similar to G-Beta
repeat region (Trp-Asp
domains) of guanine nucleotide
binding protein


1e-28


1337 


AE000213 


Escherichia coli K-12 
MG1655 section 103 
of 400 of the 
complete genome 


4e-08 


3294172


(AL022325) 1F27C3.1.1 
(protein similar to C. elegans 
protein B0035.16) (isoform 1)
[Homo sapiens] 


2e-67 


1338 


D89821 


Mus musculus mRNA 
for RhoM, complete 
cds 


2e-08 


3024539 


RHO-RELATED GTP- 
BINDING PROTEIN RHOD 
(RHO-RELATED PROTEIN 
HP1) (RHOHP1) sapiens]


1e-04


1339 


U74382 


Human telomeric 
repeat DNA-binding 
protein (PIN2) 
mRNA, complete cds


1e-08


<NONE> 


<NONE> 


<NONE> 


1340 


L35657 


Homo sapiens 
(subclone H8 5_a10
from P1 35 H5 C8)
DNA sequence.


1e-08


<NONE> 


<NONE> 


<NONE> 


1341 


L21936 


Human succinate 
dehydrogenase 
flavoprotein subunit


1e-08


3201678 


(AF060886) adenine 
phosphoribosyltransferase 
[Leishmania tarentolae]


4.0 


1342 


AB009777 


Homo sapiens gene 
for osteonidogen,
promoter region


1e-08


479388 


tritin - wheat
>gi|391929|gnl|PID|d1003454


2.2 


1343 


M58600 


Human heparin 
cofactor II (HCF2)
gene, exons 1 through
5.


1e-08


1730173 


GLUCOSE-6-PHOSPHATE 
ISOMERASE, CYTOSOLIC 2 
(GPI) (PHOSPHOGLUCOSE 
ISOMERASE) (PGI) isomerase 
[Clarkia concinna]


1.9 


1344 


M58600 


Human heparin 
cofactor II (HCF2) 
gene, exons 1 through
5.


1e-08


1730173


GLUCOSE-6-PHOSPHATE
ISOMERASE, CYTOSOLIC 2 
(GPI) (PHOSPHOGLUCOSE 
ISOMERASE) (PGI) isomerase 
[Clarkia concinna] 


1.7 


1345 


AC000980 


Homo sapiens 
(subclone 1_e2 from
P1 H31) DNA
sequence


1e-08


439S77


(L27428) reverse transcriptase
[Homo sapiens]


1.1 





1346 


U48734 


Human non-muscle
alpha-actinin mRNA,
complete cds


1e-08


168237 


(M76546) hydroxyproline-rich 
protein [Helianthus annuus]


0.19 


1347 


M76724 


Human leukocyte 
adhesion receptor 
alpha subunit 


1e-08


1177607


(X92485) pval [Plasmodium
vivax]


0.19 


1348 


AF067959 


Gallus gallus 
homeodomain protein 
HOXD-3 mRNA,
complete cds


1e-08


3165574


(AF067942) No definition line
found [Caenorhabditis elegans]


0.15 


1349 


Z81014 


Human DNA 
sequence from 
cosmid U65A4,
between markers
DXS366 and DXS87
on chromosome X *


1e-08


2072964


(U93569) putative p150 [Homo
sapiens] 


0.001 


1350 


X57103 


Human h-lys gene for 
lysozyme (upstream
region)


7e-09




<NONE> 


<NONE> 


1351 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


7e-09


231629


BILE-SALT-ACTIVATED
LIPASE PRECURSOR ESTER 
LIPASE) (STEROL 
ESTERASE) (CHOLESTEROL 
ESTERASE) salt-activated 
lipase [Homo sapiens] sapiens] 


0.22 


1352 


L34741 


Aplysia californica 
prohormone 
convertase (PC2) 
mRNA, complete cds.


5e-09 


322054 


cytochrome-c oxidase (EC 
1.9.3.1) chain II precursor - 
Synechocystis sp. (PCC 6803) 
>gi|581739 sp.]


5.0 


1353 


AF052959 


Homo sapiens type 
XV collagen 
(COL15A1) gene, 
exon 6 


4e-09 


131269 


PHOTOSYSTEM II P680
CHLOROPHYLL A 
APOPROTEIN (CP-47 
PROTEIN) 

>gi|7270S|pir||QJLV6A 
photosystem II chlorophyll a- 
binding protein psbB - liverwort 
(Marchantia polymorpha) 
chloroplast >gi|11700


l.S 


r <_>_jj lu hvj .vim. i ii u hji; 




1354 


L15470


Streptomyces
clavuligerus (NRRL
3585) clavulanic acid
biosynthesis protein
(cla) gene, complete
cds and clavaminate
synthase 2 (cs2) gene,
partial cds.


4e-09


586028


(AGMATINE
UREOHYDROLASE) (AUH)
(PROCLAVAMINIC ACID
AMIDINO HYDROLASE)
>gi|1361423|pir||S57669
Proclavaminic acid amidino
hydrolase - Streptomyces
clavuligerus >gi|295171
Proclavaminic acid amidino
hydrolase [Streptomyces
clavuligerus]
>gi|1586122|prf||2203286B
proclavaminic acid amidino
hydrolase [Streptomyces
clavuligerus]


4e-13 


1355 


AB002302 


Human mRNA for 
KIAA0304 gene,
complete cds 


2e-09 


131600 


GENERAL SECRETION 
PATHWAY PROTEIN L 
product [Klebsiella pneumoniae] 
>gi|149311 (M32613) pulL


2.5 


1356 


L34219 


Homo sapiens 
retinaldehyde-binding 
protein (CRALBP) 
gene, complete cds. 


1e-09


<NONE> 


<NONE> 


<NONE> 


1357 


AB002302 


Human mRNA for
KIAA0304 gene,
complete cds


1e-09


2224549 


(AB002302) KIAA0304 [Homo 
sapiens] 


5.0 


1358 


D85731 


Homo sapiens 
HSPA1L gene for 
Heat shock protein 70 
testis variant, 5'UTR,
partial sequence


1e-09


1389766 


(U5865S) unknown [Homo 
sapiens] 


1.3 


1359 


AF064483 


Homo sapiens natural 
resistance-associated 
macrophage protein 2 
(NRAMP2) gene, 
exon 17, alternatively
spliced non-CRE
form, complete cds


8e-10


113671 


!!!! ALU CLASS F WARNING 
ENTRY !!!! 


0.72 


1360 


AF002283 


Mus musculus alpha- 
actinin-2 associated 
LIM protein mRNA, 
alternatively spliced 
product, complete cds 


oe-10 


2996196


(AF053367) carboxyl terminal
LIM domain protein [Mus
musculus]




1361 


M26220 


African green 
monkey origin of 
replication 


5e-10 


2143455 


gene DMR-N9 protein - mouse 
(fragment)


8.8 


1362 


Z78006


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA7F10


4e-10


2072977


(U93574) putative p150 [Homo
sapiens] 


0.005 


1363 


U82303 


Homo sapiens 
unknown protein 
mRNA, partial cds 


2e-10 


1825711 


(U88183) similar to the 
immunoglobulin superfamily,
most similar to neural cell
adhesion proteins
[Caenorhabditis elegans]


0.031 


1364 


AF079764 


Drosophila 
melanogaster 
enhancer of 
polycomb 


2e-10 


3757890 


(AF079764) enhancer of 
polycomb [Drosophila 
melanogaster]


1e-10


1365 


L24123 


Homo sapiens NRF1
protein (NRF1)
mRNA.


2e-10


3004573


(AC004520) similar to NFE2-
related transcription factors;
similar to I4S694
(PID:g2137676) [Homo
sapiens] 


4e-53 


1366 


M9U54 


Orangutan alpha- 
globin gene duplicate 
region.


1e-10


464239


NADH-UBIQUINONE
OXIDOREDUCTASE CHAIN
4 >gi|10851S5|pir||S5296S
NADH dehydrogenase chain 4 - 
honeybee mitochondrion 
(SGC4) >gi|552446 (L06178) 
NADH dehydrogenase subunit 4 
[Apis mellifera ligustica] 


6.0 


1367 


D87117 


House mouse; 
Musculus domesticus 
brain mRNA for 
SAP102, complete
cds


6e-11


473912


(L31961) phosphoprotein [Mus
cookii] 


2.2 


1368 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA
sequence


5e-11


<NONE>


<NONE>


<NONE> 


1369


AC001002


Homo sapiens
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1370 


AB007874


Homo sapiens
KIAA0414 mRNA,
partial cds


5e-11


<NONE> 


<NONE> 


<NONE> 


1371 


AC001002


Homo sapiens
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1372 


AC001002


Homo sapiens
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1373 


AC001002 


Homo sapiens 
(subclone 2_h9 from 
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1374 


AC001002


Homo sapiens
(subclone 2_h9 from
P1 H39) DNA
sequence


5e-11


<NONE> 


<NONE> 


<NONE> 


1375 


Z21852


H.sapiens mRNA for
HERV-K long
terminal repeat


5e-11


419481


gag polyprotein - human
endogenous virus S71


4.6 


1376 


AB00792S 


Homo sapiens mRNA 
for KIAA0459 
protein, partial cds 


5e-11


2947238


(AF051782) diaphanous 1
[Homo sapiens]


2.8 


1377


D87117


House mouse;
Musculus domesticus
brain mRNA for
SAP102, complete
cds


5e-11


473912


(L31961) phosphoprotein [Mus
cookii]


l.S 


1378


AJ131501


Homo sapiens DNA
sequence between
two AML1 gene
promoters, 6423 BP


5e-11


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY


0.20 


1379


M27S26


Human endogenous
retroviral protease
mRNA, complete cds.


5e-11


8855S


retroviral proteinase-like protein
- human


o.oo: 


1380


U23804


Drosophila
melanogaster putative
GTP-binding
regulatory protein
beta chain (GPB)
mRNA, partial cds.


5e-11


2494916


HYPOTHETICAL 35.1 KD
TRP-ASP REPEATS
CONTAINING PROTEIN
T10F2.4 IN CHROMOSOME
III protein; similar to G-Beta
repeat region (Trp-Asp
domains) of guanine nucleotide
binding protein


1e-30


1381 


Z22784 


M.musculus troponin 
I gene.


"1^ 1 1 




(AF072889) transcription 
repressor brain factor 2 


0.053 


1382 


now / oou 


Homo sapiens 
KIAA0420 mRNA, 
complete cds 


-ic- 1 1 


<NONE>


<NONE> 


<NONE> 


1383 


AF020361


Homo sapiens BAX
gene, exon 6, partial
sequence


2e-11


<NONE> 


<NONE> 


<NONE> 


1384 


L35600 


Homo sapiens DNA 
sequence. 


2e-11


1174952


GLYCOPROTEIN D
PRECURSOR gD [Bovine
herpesvirus 1]


0.25 


1385 




Human organic anion
transporting
polypeptide




LI J8i2j 


(U95011) brain-specific organic
anion transporter 


9e-19 


1386 


U90878 


Homo sapiens 
carboxyl terminal 
LIM domain protein 


2e-11


2996196 


(AF053367) carboxyl terminal 
LIM domain protein [Mus 
musculus] 


4e-23 


1387 


U31929 


Human orphan
nuclear receptor
(DAX1) gene,
complete cds


6e-12 


<NONE> 


<NONE> 


<NONE> 


1388 




Human von
Willebrand factor
gene, exon 1, 2, and
3, and three Alu
repetitive elements.


f,* to 

OC 1 ^ 




<NONE>


<NONE>


1389 


AB020648


Homo sapiens mRNA
for KIAA0841
protein, partial cds


3e-12


<NONE> 


<NONE> 


<NONE>


1390 


Z15026


H.sapiens genes for
tumor necrosis factor
(Tnfa) and
lymphotoxine (Tnfb)


2e-12


<NONE> 


<NONE> 


<NONE> 


1391 


L28101


Homo sapiens
kallistatin (PI4) gene,
exons 1-4, complete
cds


2e-12


<NONE> 


<NONE> 


<NONE> 


1392 


Z47046


Human cosmid
3LL2C9 from Xq2S


2e-12


<NONE> 


<NONE> 


<NONE> 


1393


Z79007


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20E2


2e-12


106322 


hypothetical protein (L1H 3'
region) - human


1 -J 


1394 


U34377 


Human tyrosine 
kinase TXK (txk) 
gene, exon 13. 


1e-12


151484 


(M55524) ORF 4; putative 
[Pseudomonas aeruginosa] 


4.3 


1395 


D70845 


Mus musculus apg-1 
gene for novel 
member of heat shock 
protein 110, promoter
region


1e-12


113658 


ALKALINE PROTEINASE 
PRECURSOR (ALP) precursor -
fungus (Acremonium 
chrysogenum) 


3.5 


1396 


M6397S 


Human vascular
endothelial growth
factor gene, exon 8.


1e-12


3982737 


(AF069731) calmodulin- 
dependent protein kinase II beta 
M isoform [Rattus norvegicus]


0.083


1397 


U60266 


Homo sapiens 
lysosomal alpha- 
mannosidase (manB) 

HUM < (l! 11 1/ In. lb V. Uj 


8e-13


<NONE>


<NONE>


1398 


Z68297 


Caenorhabditis 
elegans cosmid 
F11A10, complete
sequence
[Caenorhabditis
elegans]


7e-13 


2393734 


(AC002542) similar to C. 
elegans F11A10.5; 80%
similarity to Z68297
(PID:g1130619) [Homo
sapiens] 


5e-34 


1399 


Z68297 


Caenorhabditis 
elegans cosmid 
F11A10, complete
sequence 
[Caenorhabditis 
elegans] 


7e-13 


2393734 


(AC002542) similar to C. 
elegans F11A10.5; 80%
similarity to Z68297
(PID:g1130619) [Homo
sapiens] 


3e-38 


1400 


Z68885 


Human DNA 
sequence from 
cosmid L21F12B,
Huntington's Disease
Region, chromosome
4p16.3, contains
EST.


6e-13 


<NONE> 


<NONE> 


<NONE> 


1401 


X76104 


H.sapiens DAP-
kinase mRNA


6e-13


2911154


(AB007143) ZIP-kinase [Mus
musculus]


0.007 


1402 


Z7S668


H.sapiens flow-sorted
chromosome 6 TaqI
fragment,
SC6pA13G4


5e-13


106322


hypothetical protein (L1H 3'
region) - human


2e-06 


1403 


L35600


Homo sapiens DNA
sequence.


3e-13


3184290


(AC004136) hypothetical
protein [Arabidopsis thaliana]


1.7 



1404


AF090452


Cloning vector
pKODT complete
sequence


2e-13


3876730 


(Z49966) F35C11.4 
[Caenorhabditis elegans] 


7.8 


1405 


D28126 


Human gene for ATP
synthase alpha
subunit, complete cds
(exon 1 to 12)


2e-13 


419481 


gag polyprotein - human 
endogenous virus S71 


3.4 


1406 


AF005219 


Homo sapiens 
transcription factor 
HOXD13 


2e-13 


2822166 


(AC004080) transcription factor
HOXA13 [Homo sapiens]


5e-09 


1407 


AB018301


Homo sapiens mRNA
for KIAA0758
protein, partial cds


2e-13


3882237


(AB018301) KIAA0758 protein
[Homo sapiens]


1e-23


1408 


D70845 


Mus musculus apg-1
gene for novel
member of heat shock
protein 110, promoter
region


1e-13


113658 


ALKALINE PROTEINASE 
PRECURSOR (ALP) precursor - 
fungus (Acremonium 
chrysogenum)


3.1 


1409 


AG000691 


Homo sapiens 
genomic DNA, 21q
region, clone: 
TI71BG33 


8e-14 


930045 


(X15332) alpha-1 (III) collagen
[Homo sapiens]


3e-04 


1410 


D30785 


Mouse mRNA for 
neuropsin, complete 
cds 


8e-14


3559978 


(AJ005641) serine protease 
[Rattus rattus] 


2e-12 


1411 


U32710 


Haemophilus 
influenzae Rd section 
25 of 163 of the 
complete genome 


8e-14 


4106673 


(AL035064) queuine trna- 
ribosyltransferase 
[Schizosaccharomyces pombe] 


2e-38 


1412 


AG000S86 


Homo sapiens 
genomic DNA, 21q
region, clone: 
64EI1X19 


7e-14 


1363925 


hypothetical protein 2 - North 
American opossum (fragment) 
>gi|897721 (Z48955) ORF-2, 
putative RT [Didelphis 
virginiana]


1.1


1413 


Z62664 


H.sapiens CpG DNA,
clone 71d11, forward
read cpg71d11.ft1a.


7e-14


3953461


(AC00232S) F20N2.6
[Arabidopsis thaliana]


0.085 


1414 


AB014532


Homo sapiens mRNA
for KIAA0632
protein, partial cds


7e-14


113668


!!!! ALU CLASS C WARNING
ENTRY !!!! 


0.040 


1415 


Z96478 


H.sapiens telomeric 
DNA sequence, clone 
20PTEL004, read 
20PTELOO004.seq 


7e-14 


2981631 


(AB012223) ORF2 [Canis
familiaris] 


2e-04 


1416 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-14 


<NONE> 


<NONE> 


<NONE> 


1417 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


4e-14 


<NONE> 


<NONE> 


<NONE> 


1418


AF033349


Homo sapiens MLL
gene breakpoint
cluster region, intron
1, partial sequence


3e-14


728831


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


9.3 


1419 


AC001526 


Homo sapiens 
(subclone 4_f6 from 
P1 H54) DNA
sequence 


3e-14 


99861 


extensin - almond >gi|20420 
(X65718) extensin 


9.2 


1420 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


3e-14 


728832 


!!!! ALU SUBFAMILY SB 
WARNING ENTRY 


0.15 


1421 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


2e-14


3913573 


EPHRIN-A2 PRECURSOR 
(EPH-RELATED RECEPTOR 
TYROSINE KINASE LIGAND 
6) (LERK-6) sapiens] 
>gi|292476l (AC004258) 
EPL6_HUMAN [Homo sapiens] 


8.7 


1422 


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


9e-15


119040


E1B PROTEIN, SMALL T-
ANTIGEN (E1B 19K)
>gi|74142|pir||Q1AD25 early
E1B 21K protein II - human
adenovirus 5 >gi|5S489
(X02996) mRNA 5 first reading
frame [Human adenovirus type
5] adenovirus type 5]
>gi|209797 (J01969) 21 kD
protein


1.5 

1423


AF012899


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


OC- 1 J


477102


transcription factor GATA-4,
retinoic acid-inducible - mouse
>gi|293345 (M98339) GATA-
binding transcription factor
[Mus musculus]


0.57




1424 


AB012223 


Canis familiaris LINE- 
1 element ORF2 
mRNA, complete cds 


8e-15 


92385 


hypothetical protein - rat 
(fragment) 


0.003 


1425 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


3e-15 


<NONE> 


<NONE> 


<NONE> 




1426 


X12433 


Human pHS1-2 
mRNA with ORF 
homologous to 
membrane receptor 
proteins 


3e-15 


422532 


collagen alpha 3(IV) chain - sea 
urchin 


8.9 




1427 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


3e-15 


1353143 


UU/M3 K DT g CTTT7 '1 L' I n 

HORMONE RECEPTOR 
E02H1.7 

>gi|3875431|gnl|PID|e1344980 
(Z47075) similar to Zinc finger, 
C4 type (two domains) 
[Caenorhabditis elegans] 


5.0 


1428 


Z69651 


sequence from 
cosmid L75B9. 
Huntington's Disease 
Region, chromosome 
4p16.3 


3e-15 


403460 


(L24521) transformation-related 
protein [Homo sapiens] 


0.60 


1429 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-15 


108750 


Ig heavy chain precursor 

(B/MT.4A.17.H5.A5) - bovine 
>gi|440 (X62916) anti- 
testosterone antibody [Bos 
taurus] 


1.1 


1430 


X83299 


H.sapiens SMA3 
mRNA 


2e-15 


671530 


(X83299) SMA3 gene product 
[Homo sapiens] 


0.32 


1431 


U01377 


Human p300 protein 
mRNA, complete cds. 
> :: gb|I62297|I62297 
sequence 1 from 
patent US 5658784 


2e-15 


3024341 


E1A-ASSOCIATED PROTEIN 
P300 


0.019 
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Nearest Neighbor fBlastN vs. Genhanl^ 



SEQ 

ID | ACCESSION 



DESCRIPTION 



P VALUE 



1432| X16516 



Mouse MHC (Qa) Q2 
k gene for class I 
antigen, exons 4-8 



1433 | M74165 



Chicken tensin 
mRNA. complete cds 



1434 | X71893 



H.sapiens gene for 
immunoglobulin 
kappa light chain 
variable region 04 
and 05 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



H i fU 1 Hh 1 HAL 4J. 1 KJJ- 
VKUlfclfl LlbUU.b Ifl 



P VALUE 



1e-15 



2496897 



1e-15 



283920 



CHROMOSOME III 
>gi|3874384|gnl|PID|e1344078 
EST EMBL:C08256 comes 
from this gene; cDNA EST 
EMBL:C09941 comes from this 
gene; cDNA EST yk340a10.3 
comes from this gene; cDNA 
EST yk340a10.5 comes from 
this gene [Ca... 



7e-08 



9e-16 



<NONE> 



tensin - chicken >gi|212752 
(M74165) tensin 



2e-19 



<NONE> 



1435 | U05227 



Human Rar protein 
mRNA, complete cds. 



9e-16 



3036779 



(z.544/y; match: multiple 

proteins; match. 000407 
IQ12829 P22127 P36861 
Q40219; match. P70550 
[Q41022 P22125 Q08155 
IP352S6; match: P5U4S P51147| 
P35293 P36861 P352S9; match:, 
P35284 Q402I7 P5U52 
P51157 PS1 158: match: O41022I 



<NONE> 



1436 | M23404 



Chicken erythrocyte 
anion transport 
protein (band3) 
mRNA, complete cds. 



9e-16 



726403 



(U23175) similar to anion 
exchange protein 
[Caenorhabditis elegans] 



1437 | X16145 



Rat mRNA for liver a- 
L-Fucosidase (EC 
3.2.1.51) 



9e-16 



67502 



alpha-L-fucosidase (EC 
3.2.1.51) 1 precursor, tissue - 
human >gi|178409 (M29877) 
alpha-L-fucosidase precursor 
(EC 3.2.1.5) [Homo sapiens] 



1e-28 



2e-29 



1438 | AF012899 
Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



1439 | AF076981 



Mus musculus brain 
mitochondrial carrier 
protein BMCP1 
(Bmcp1) mRNA, 
complete cds 



8e-16 



<NONE> 



<NONE> 



8e-16 



3851540 



(AF078544) brain mitochondrial 
carrier protein-1 [Homo sapiens] 



2e-13 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 

ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






H.sapiens MN/CA9 






!!!! ALU SUBFAMILY J 




1440 


Z54349 


GENE 


5e-16 


728831 


WARNING ENTRY 


0.002 


1441 


AF077003 


Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds 


3e-16 


309123 


(M35526) complement 
component C5D [Mus 
musculus] 


3.1 


1442 


X64587 


M. musculus mRNA 
for splicing factor 
U2AF (65 kD) 


3e-16 


2143767 


glycoprotein - rat >gi|986943 
(L08134) glycoprotein [Rattus 
norvegicus] norvegicus] 


0.003 


1443 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


3e-16 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-20 


1444 


Z73987 


Human DNA 
sequence from 
cosmid N120B6 on 
chromosome 22 
Contains ESTs, 
complete sequence 
[Homo sapiens] 


1e-16 


<NONE> 


<NONE> 


<NONE> 


1445 


M58318 


Homo sapiens ala 
gene. 


1e-16 


<NONE> 


<NONE> 


<NONE> 


1446 


U44103 


Human small GTP 
binding protein Rab9 
mRNA, complete cds 


1e-16 


1552584 


(Z80233) hypothetical protein 
Rv0029 


1.3 


1447 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


9e-17 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


2e-20 


1448 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-17 


<NONE> 


<NONE> 


<NONE> 


1449 


M76762 


Mus musculus 
ribosomal protein (Ke 
3) gene, exons 1 to 5, 
and complete cds. 


1e-17 


1073048 


pupR protein - Pseudomonas 
putida >gi|525260 


0.36 


1450 


D50561 


Human DNA, 
replication enhancing 
element (REE1) 


4e-18 


126295 


LINE-1 REVERSE 
TRANSCRIPTASE 
HOMOLOG 


0.78 


1451 


D16431 


Human mRNA for 
hepatoma-derived 
growth factor, 
complete cds 


4e-18 


3242079 


(AJ006984) proline-rich protein 


0.018 



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Nearest Neighbor rBlastN vs. Genbartk) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1452 




Mus musculus heat 
shock protein hsp40-_ 
mRNA, complete cds 


4e-18 


3873707 


(Z73102) Similarity to B.subtilis 
DNAJ protein 

(SW:DNAJ_BACSU); cDNA 
EST yk437a1.5 comes from this 
gene [Caenorhabditis elegans] 


9e-25 


1453 


U60205 


Human methyl sterol 
oxidase (ERG25) 
mRNA, complete cds 


3e-18 


<NONE> 


<NONE> 


<NONE> 


1454 


AF038177 


Homo sapiens clone 
23899 mRNA 
sequence 


1e-18 


1360775 


G protein-coupled receptor 74 - 
equine herpesvirus 2 >gi|695246 
(U20824) G protein-coupled 
receptor [Equine herpesvirus 2] 


5.1 


1455 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


1e-18 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-21 


1456 


AB014561 


Homo sapiens mRNA 
for KIAA0661 
protein, complete cds 


1e-18 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-22 


1457 


U34374 


Human tyrosine 
kinase TXK (txk) 
gene, exons 9 and 10. 


1e-19 


<NONE> 


<NONE> 


<NONE> 


1458 


AB006969 


Homo sapiens 

hGAA1 mRNA, 
complete cds 


1e-19 


4151809 


(AF102855) synaptic SAPAP- 
interacting protein Synamon 


0.19 


1459 


AB002293 


Human mRNA for 
KIAA0295 gene, 
partial cds 


1e-19 


2224531 


(AB002293) KIAA0295 [Homo 
sapiens] 


6e-17 


1460 


Z59664 


H. sapiens CpG DNA, 
clone 168f9, reverse 
read cpg168f9.rt1a. 


5e-20 


3880251 


(Z82055) predicted using 
Genefinder 


6.5 


1461 


M73837 


Human modulator 
recognition factor 2 
(MRF-2) mRNA, 
complete cds. 


5e-20 


284313 


modulator recognition factor 2 - 
human factor 2 [Homo sapiens] 


0.019 



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SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank) 


ACCESSION 


DESCRIPTION 


P VALUE 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ACCESSION 


DESCRIPTION 


P VALUE 


1462 | U24267 


Human pyrroline-5- 

carboxylate 

dehydrogenase 


5e-20 | 2506350 


DELTA-1-PYRROLINE-5- 

CARBOXYLATE 

DEHYDROGENASE 

PRECURSOR (P5C 
. DEHYDROGENASE) 
o'i* JJJ «*^o zmpiensj 

>gi|1353250 (U24267) pyrroline- 

5-carboxylate dehydrogenase 

[Homo sapiens] 

>gi|OoyDH5|prf||221 1355A 
Delta1-pyrroline-5-carboxylate 
dehydrogenase [Homo sapiens] 


e 1 


1463 | U13262 


Mus musculus myelin 
gene expression 
factor 


^-20 1 536926 


(U13262) myelin gene 
expression factor [Mus 
musculus] 


5e-04 
3e-07 


1464 | U13262 


Mus musculus myelin 
gene expression 
factor 


4e-20 | 3126878 


(AF061S32) lM4 protein 
deletion mutant [Homo sapiens] 


1e-08 


1465 | Z61239 


H.sapiens CpG DNA, 
clone 48f10, forward 
read cpg48f10.ft1a. 


4e-20 | 1669601 


(D88747) AR401 [Arabidopsis 
thaliana] 


8e-19 


1466 | U89915 


Mus musculus 
junctional adhesion 
molecule (Jam) 
mRNA, complete cds 


1e-20 | 3462455 


(U89915) junctional adhesion 
molecule [Mus musculus] 


7e-11 


1467 | AF029071 


Gallus gallus p52 pro- 
apototic protein 
mRNA, complete cds 


7e-22 | 2599492 


(AF029071) P52 pro-apototic 
protein [Gallus gallus] 


1e-15 


1468 | M25636 


Figure 4. Nucleotide 
sequence of the 
PKS36 1.797 kb 
insert. 


6e-22 | 1196398 


(M21305) unknown protein 
[Homo sapiens] 


0.65 


1469 | 


Homo sapiens mRNA 
for KIAA0S4S 
protein, complete cds 


6e-22 | 4240325 


(AB020725) KIAA0918 protein 
[Homo sapiens] 


1e-19 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












PROCOLLAGEN ALPHA 




1470 


S80935 


chorionic 

gonadotropin beta 1 
(CG beta 1) subunit 


5e-22 


115310 


1(IV) CHAIN PRECURSOR 
>gi|84917|pir||A31893 collagen 
alpha 1(IV) chain precursor - 
fruit fly (Drosophila 
melanogaster) melanogaster] 
>gi|157078 (M96575) type IV 
collagen pro-collagen 
[Drosophila melanogaster] 


0.027 


1471 


AF053066 


Homo sapiens 
microsatellite 
D5S2926 sequence 


2e-22 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


3e-04 


1472 


U55177 


Danio rerio carbonic 
anhydrase homolog 
CAH-Z mRNA, 
complete cds 


2e-22 


3123190 


CARBONIC ANHYDRASE 
(CARBONATE 

DEHYDRATASE) >gi|2576335 
(U55177) CAH-Z [Danio rerio] 


5e-14 


1473 


AF064250 


Gallus gallus 
ubiquitin specific 
protease 66 


2e-22 


2736064 


(AF016107) ubiquitin specific 
protease 41 [Gallus gallus] 


7e-37 


1474 


AF030880 


Homo sapiens 
pendrin (PDS) 
mRNA, complete cds 


2e-22 


729367 


DRA PROTEIN (DOWN- 
REGULATED IN ADENOMA) 
>gi|2135020|pir||A47456 down- 
regulated in adenoma (DRA) - 
human >gi|291964 (L02785) 
Nuclear localization signal at 
AA 569-573, 576-580, 579-583; 
acidic transcr. activ. domain 620- 
640; homeobox motif 653-676 
[Homo sapiens] 


4e-53 


1475 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


6e-23 


<NONE> 


<NONE> 


<NONE> 


1476 


X57398 


Human mRNA for 
pM5 protein 


3e-23 


107350 


Pm5 protein - human 
>gi|1335273|gnl|PID|e36241 


1e-04 


1477 


AB010998 


Rattus norvegicus 
PAD-R11 mRNA for 
Peptidylarginine 
deiminase type I, 
complete cds 


2e-23 


<NONE> 


<NONE> 


<NONE> 


1478 


D10871 


Human hNAT allele 
2-2 gene for 
arylamine N- 
acetyltransferase 


2e-23 


171200 


(J04734) CDC6 protein 
[Saccharomyces cerevisiae] 


9.8 


1479 


D10871 


Human hNAT allele 
2-2 gene for 
arylamine N- 
acetyltransferase 


2e-23 


171200 


(J04734) CDC6 protein 
[Saccharomyces cerevisiae] 


8.3 



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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



1480 | AF024541 
1481 | L13773 



DESCRIPTION 



P VALUE 



Homo sapiens MLL- 



1482 | AF100694 



1483 | U75467 
1484 | D17076 



AF4 fusion protein 
mRNA, partial cds 

Human AF-4 mRNA, 
complete cds, 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Drosophila 
melanogaster Rga and 
Atu genes, complete 

cds 

Human HepG2 partial 
cDNA, clone 
hmd5a09m5 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



2e-23 



2e-23 



8e-24 



DESCRIPTION 



P VALUE 



2136142 | serine/proline-rich FEL protein, 
splice form 1 - human 

3063962 | (AF031404) MLL-AF4 fusion 
protein [Homo sapiens] 



<NONE> 



<NONE> 



8e-24 



7e-24 



1658503 | (U75467) Atu [Drosophila 
melanogaster] 



<NONE> 



<NONE> 



1485 | AF100694 



1486 | M11167 
1487 | AF100694 



1488 | AB003468 



1489 | X03541 



1490 | L81652 



1491 | U95760 



1492 | AF100694 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Human 28S 
ribosomal RNA gene 
Mus musculus 
Pontin52 mRNA, 
complete cds 



7e-24 



1169643 



FMRFAMIDE-RELATED 
NEUROPEPTIDES 
PRECURSOR >gi|416208 
(U03137) neuropeptide 
precursor FMRFamide-related 
peptide [Lymnaea stagnalis] 



Cloning vector 
pAP3neo DNA, 
complete sequence 



Human mRNA of trk 
oncogene > :: 
gb|I96186|I96186 
Sequence 23 from 
patent US 5734039 



Homo sapiens 
(subclone 2_g11 from 
P1 H43) DNA 
sequence 



Drosophila 
melanogaster 
strawberry notch 
(sno) mRNA, 
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds 



2e-24 



2e-24 



3875481 | (Z81054) predicted using 
Genefinder; Similarity to UDP 
glucoronosyltransferases 

549173 | USP1 PROTEIN PRECURSOR 
>gi|169623 



le-20 



<NONE> 



2e-37 



7e-10 



2e-24 



987050 



(X65335) lacZ gene product 
[unidentified cloning vector] 



2e-24 



325465 



2e-24 



225047 



2e-24 



2078282 



8e-25 



2623773 



(M74509) [Human endogenous 
retrovirus type C oncovirus 
sequence.], gene product [Homo 
sapiens] 



reverse transcriptase related 
protein [Homo sapiens] 



(U95760) Sno [Drosophila 
melanogaster] 



(AF004835) tyrocidine 
synthetase 3 [Brevibacillus 
brevis] 



5.1 



0.058 



3e-04 



4e-12 



2e-41 



8.6 






WO 01/02568 
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SE( 
ID 


Nearest Neighbor (BlastN vs. Genbank) 


ACCESSION 


DESCRIPTION 


P VALUE 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ACCESSION 


DESCRIPTION 


P VALUE 


1493 


AB002405 


Homo sapiens mRNA 
for LAK-4p, 
complete cds 


8e-25 


2496822 


HYPOTHETICAL 127.3 KD 
PROTEIN B0416.1 IN 
CHROMOSOME X >gi|7465a 
(U23516) B0416.1 gene product 
[Caenorhabditis elegans] 


9e-11 


1494 


K03002 


Human mRNA from 
chromosome 15 gene 
with homology to 
MHC-HLA-SB-l 
intron A. 


8e-25 


1514614 


(X92842) nuclear protein [Mus 
musculus] 


1e-13 


1495 


U61232 


Human tubulin- 
folding cofactor E 
mRNA, complete cds 


7e-25 


1465772 


(U61232) cofactor E [Homo 
sapiens] 


2e-05 


1496 


U10245 


Arabidopsis thaliana 
Col-O putative RNA 
helicase A mRNA, 
complete cds. 


5e-25 


1353239 


(U10245) putative RNA 
helicase A [Arabidopsis 
thaliana] 


1e-37 


1497 


XS9211 


H.sapiens DNA for 
endogenous retroviral 
like element 



3e-25 


">0652 10 


(Y12713) Pro-Pol-dUTPase 
polyprotein 


5e-06 


1498 


L81652 


Homo sapiens 
(subclone 2_g11 from 
P1 H43) DNA 
sequence 


3e-25 


2072961 


(U93568) putative pl50 [Homo 
sapiens] 


5e-16 


1499 


X82895 


H. sapiens mRNA for 
DLG2 


2e-25 


2497511 


MAGUK P55 SUBFAMILY 
MEMBER 2 (MPP2 PROTEIN) 
DISCS, LARGE HOMOLOG 

1) 


1e-34 


1500 


M36654 


Mouse homeo box 
2.6 (Hox-2.6) mRNA, 
complete cds. 


ye-— O 


3323169 


(AE001255) T. pallidum 
predicted coding region TP0854 


1.9 


1501 


L36315 


Mus musculus (clone 
pMLZ-1) zinc finger 
protein 


9e-26 


1806134 


(Z67747) zinc finger protein 
[Mus musculus] 


4e-05 


1502 


AB018281 


Homo sapiens mRNA 
for KIAA0738 
protein, complete cds 


9e-26 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


1e-07 


1503 


AF017433 


Homo sapiens 
putative transcription 
factor CR53 


9e-26 


3219985 


ZINC FINGER PROTEIN ZFP- 


1e-17 



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SEC 
ID 


Nearest Neighbor (BlastN vs. Genbank) 


ACCESSION 


DESCRIPTION 


P VALUE 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ACCESSION 


DESCRIPTION 


P VALUE 


1504 


AC001225 


Homo sapiens 
(subclone 2_c6 from 
BAC H94) DNA 
sequence 


8e-26 


2653713 


(U91823) small S protein 
[Hepatitis B virus] 


4.3 


1505 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


8e-26 


283446 


cysteine-rich surface antigen 72, 
CRP72 - Giardia lamblia 
(fragment) 


3.4 


1506 


X94912 


H. sapiens Pr22 gene 


3e-26 


728837 


!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 


4e-09 


1507 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-26 


<NONE> 


<NONE> 


<NONE> 


1508 


U44103 


Human small GTP 
binding protein Rab9 
mRNA, complete cds 


1e-26 


3327038 


(AB014512) KIAA0612 protein 
[Homo sapiens] 


8.7 


1509 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-27 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


0.14 


1510 


AG001212 


Homo sapiens 
genomic DNA, 21q 
region, clone: 
9H11N46 


9e-27 


126296 


LINE-1 REVERSE 
TRANSCRIPTASE 
HOMOLOG protein 
[Nycticebus coucang] 


0.012 


1511 


AF027131 


Mus musculus mucin 
glycoprotein MUC3 
mRNA, partial cds 


9e-27 


2589172 


(U76551) mucin Muc3 [Rattus 
norvegicus] 


2e-14 


1512 


U49057 


Rattus norvegicus 
CTD-binding SR-like 
protein rA9 mRNA, 
complete cds 


5e-27 


1438534 


(U49057) rA9 [Rattus 
norvegicus] 


1e-04 


1513 


J03764 


Human, plasminogen 
activator inhibitor-1 
gene, exons 2 to 9. 


3e-27 


<NONE> 


<NONE> 


<NONE> 


1514 


Z78160 


M. musculus partial 
cochlear mRNA 
(clone 2SD2) 


3e-27 


1490362 


(Z78160) unknown [Mus 
musculus] 


2e-05 


1515 


Z64210 


H.sapiens CpG DNA, 
clone 99b4, reverse 
read cpg99b4.rt1a. 


3e-27 


2257533 


(AB004538) LIPOIC ACID 
SYNTHETASE 
PRECURSOR (LIP-SYN) 
[Schizosaccharomyces pombe] 


1e-06 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










1516 


L35659 


(subclone H8 6_h6 
from PI 35 H5 CS) 
DNA sequence. 


1e-27 


<NONE> 


<NONE> 


<NONE> 


1517 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


1644471 


(U72686) odorant receptor 4 
[Danio rerio] 


7.5 


1518 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


2738388 


(AF003534) hypothetical 
protein 004L [Chilo iridescent 
virus] 


6.7 


1519 


AB009271 


Homo sapiens gene 
for BCNT, partial cds 


1e-27 


3880909 


(AL032636) Y40B1B.3 
[Caenorhabditis elegans] 


4.6 


1520 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.85 


1521 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


121805 


ENDOGLUCANASE A 
PRECURSOR 


0.58 


1522 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


3722000 


(AF035323) survival motor 
neuron protein [Bos taurus] 


0.10 


1523 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


3328188 


(AF074902) laminin alpha chain 
[Caenorhabditis elegans] 


0.083 


1524 


AF074382 


Homo sapiens IkB 
kinase gamma subunit 


1e-27 


3641280 


(AF074382) IkB kinase gamma 
subunit [Homo sapiens] 


0.041 


1525 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-27 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


6e-04 


1526 


L78778 


Homo sapiens 
(subclone 2_e10 from 
P1 H49) DNA 
sequence 


1e-27 


225047 


reverse transcriptase related 
protein [Homo sapiens] 


2e-09 


1527 


L03427 


Human zinc finger 
protein basonuclin 
mRNA, complete cds. 


1e-27 


14S8275 


(U59694) zinc finger protein 
basonuclin [Homo sapiens] 


9e-22 


1528 


U09954 


Human ribosomal 
protein L9 gene, 5' 
region and complete 
cds. 


4e-28 


2257533 


(AB004538) LIPOIC ACID 
SYNTHETASE 
PRECURSOR (LIP-SYN) 
[Schizosaccharomyces pombe] 


2e-04 
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Nearest Neighbor 'BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 

ID 


ACCESSION 


DESCRIPTION 


P VALUE 






P VALUE 
















1529 


Z64210 


H.sapiens CpG DNA, 
clone 99b4, reverse 
read cpg99b4.rt1a. 


4e-28 


3878570 


(Z46j_l) similar to lipoic acid 
synthase; cDNA EST yk283b6.3 
comes from this gene; cDNA 
EST yk283b6.5 comes from this 
gene: cDNA EST yk472f5.3 
comes from this gene; cDNA 
EST yk472f5.5 comes from this 
gene; cDNA EST yk476e7.3... 


7e-11 


1530 


U55177 


Danio rerio carbonic 
anhydrase homolog 
CAH-Z mRNA, 
complete cds 


4e-28 


3123190 


CARBONIC ANHYDRASE 
(CARBONATE 
DEHYDRATASE) >gi|2576335 
(U55177) CAH-Z [Danio rerio] 


5e-21 


1531 


D43682 


Human mRNA for 
very-long-chain acyl- 
CoA dehydrogenase 
(VLCAD), complete 
cds 


4e-28 


1351839 


ACYL-COA 

DEHYDROGENASE, VERY- 
LONG-CHAIN SPECIFIC 
PRECURSOR (VLCAD) 
>gi|93035S taurus] 


3e-27 


1532 


AF016591 


Homo sapiens 
survival motor neuron 
pseudogene, complete 
sequence 


3e-28 


728831 


!!!! ALU SUBFAMILY J 
WARNING ENTRY 


3e-08 


1533 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


728832 


!!!! ALU SUBFAMILY SB 
WARNING ENTRY 


2.5 


1534 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin 
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 


0.004 


1535 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


1169643 


FMRFAMIDE-RELATED 
NEUROPEPTIDES 
PRECURSOR >gi|416208 
(U03137) neuropeptide 
precursor FMRFamide-related 
peptide [Lymnaea stagnalis] 


6e-04 


1536 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


9e-05 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated 




1537 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


region with similarity to 
gb|U43627 extensin (atExtl) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


2e-06 


1538 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


2e-09 


1539 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


1e-09 


1540 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


5e-10 


1541 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


4056454 


(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 


1e-11 


1542 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-28 


3157926 


(AC002131) Strong similarity to 
extensin-like protein gb|Z34465 
from Zea mays. [Arabidopsis 
thaliana] 


8e-12 


1543 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1544 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1545 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neighbor ( BlastN vs. Gcnbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1546 


AF100694 


Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1547 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1548 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1549 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1550 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1551 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1552 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1553 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1554 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1555 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1556 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1557 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1558 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1559 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1560 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1561 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 



WO 01/02568 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Mus musculus 










1562 


AF100694 


Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1563 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1564 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1565 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1566 


M87708 


Human simple repeat 
polymorphism. 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1567 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


<NONE> 


<NONE> 


<NONE> 


1568 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


3924779 


1."vi_uuujuj,j ininiai iu uiiiuiiii — 
B; cDNA EST yk450d8.5 comes 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

>gi|39248S 1 |gnl|PID|e 1 354569 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


3.0 


1569 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


1169643 


FMRFAMIDE-RELATED 
NEUROPEPTIDES 
PRECURSOR >gi|416208 
(U03137) neuropeptide 
precursor FMRFamide-related 
peptide [Lymnaea stagnalis] 


0.66 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neiehbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












■ uvj_uuujo_i i Milium iu uiiiniin — 




1570 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


3924779 


B; cDNA EST yk450d8.5 comes 
from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes fr... 

ioillQ'MSSllonllPmli-l T54^fi9 

from this gene; cDNA EST 
yk249a6.5 comes from this 
gene; cDNA EST yk219a2.5 
comes from this gene; cDNA 
EST yk355e4.5 comes from this 
gene; cDNA EST yk224f4.5 
comes from... 


0.65 


1571 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.49 


1572 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.49 


1573 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


283446 


cysteine-rich surface antigen 72, 
CRP72 - Giardia lamblia 
(fragment) 


0.45 


1574 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


2498937 


SPERMATOPHORIN SP23 
PRECURSOR mealworm 
>gi|161725 (M92928) structural 
protein 


0.33 


1575 


AF100694 


Mus musculus 
Pontin52 mRNA, 


1e-28 




(U60315) MC107L [Molluscum 
contagiosum virus subtype I] 


U. 1 o 


1576 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor] 


0.088 


1577 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin 
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 


0.018 


1578 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin 
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 


0.016 
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Nearest Neighbor fBlastN vs. Genbank) 



SEQ 

ID | ACCESSION | DESCRIPTION | P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



1579 | AF100694 
Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28 



118588 



1580 | AF100694 
Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28 



4056454 



1581 | AF100694 
Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28 



118588 



1582 | AF100694 
Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28 



1583 | AF100694 
Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28 



1169643 



4056454 



1584 | AF100694 
Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28 



DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin 
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 

(AC005990) Contains repeated 
region with similarity to 
gb|U43627 extensin (atExt1) 
gene from Arabidopsis thaliana. 
ESTs gb|Z34165 and gb|Z18788 
come from this gene. 
[Arabidopsis thaliana] 



DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin 
DHN3 - garden pea >gi|20709 
(X63063) pea dehydrin DHN3 
[Pisum sativum] 



FMRFAMIDE-RELATED 
NEUROPEPTIDES 
PRECURSOR >gi|416208 
(U03137) neuropeptide 
precursor FMRFamide-related 
peptide [Lymnaea stagnalis] 



118588 



1585


AF100694



1586


AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



4056454 



Mus musculus 
Pontin52 mRNA,
complete cds



1e-28



118588 



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



o.o i: 



0.010



0.002 



0.002 



DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]



0.002 



0.002 



0.002



0.001 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE












(AC005990) Contains repeated




1587 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.001


1588 


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-04 


1589 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-04 


1590 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-04 


1591 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


118588 


DEHYDRIN DHN3
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


2e-04 


1592 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-04 


1593 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-05 



WO 01/02568 



PCT/US00/18374 





Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1594 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-05 


1595 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05


1596 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05


1597 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-06 




1598


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-06 


1599 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-06 


1600 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


544357


RNA-BINDING PROTEIN
FUS/TLS protein [human,
Peptide, 526 aa] [Homo sapiens]


4e-06 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1601 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-06


1602 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-06 


1603 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


9e-07 


1604 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


8e-07 


1605 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


1169643 


FMRFAMIDE-RELATED 
NEUROPEPTIDES 
PRECURSOR >gi|416208
(U03137) neuropeptide
precursor FMRFamide-related
peptide [Lymnaea stagnalis]


7e-07 


1606 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-07


1607 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-07



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1608 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


3e-07 


1609


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-07


1610 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-07


1611 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


7e-08


1612 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-08 


1613 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-09 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1614


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-09 


1615 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


4e-09 


1616 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


7e-10 


1617 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


6e-10 


1618 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-10 


1619 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


4e-10 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1620 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-10 


1621 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


5e-11


1622 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


1e-28


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-12 


1623 


AF032896 


Petromyzon marinus
polyadenylate binding 
protein 


1e-28


1082703 


polyadenylate binding protein II 
human 


2e-27 


1624 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-29 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.013 


1625 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


9e-29 


2133579 


spermatophorin Sp23 - yellow 
mealworm molitor]


6e-04 


1626 


AF100694


Mus musculus
Pontin52 mRNA, 
complete cds 


9e-29 


3876465


(Z81071) predicted using
Genefinder; Similarity to
Human small nuclear
ribonucleoprotein E cDNA EST
yk375s7.5 comes from this
gene; cDNA EST yk435r"5.3
comes from this gen...


9e-06 


1627 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


8e-29


4056454


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


2e-06 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












ADP-RIBOSYLATION




1628 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


4e-29 


728883 


FACTOR 3 fruit fly (Drosophila
melanogaster) >gi|507234
(L25063) ADP ribosylation
factor 3 [Drosophila
melanogaster]


0.016 


1629 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-29 


544357 


RNA-BINDING PROTEIN 
FUS/TLS protein [human,
Peptide, 526 aa] [Homo sapiens]


2e-07 


1630


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


4e-29 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-08


1631 


D43682 


Human mRNA for 
very-long-chain acyl- 
CoA dehydrogenase 
(VLCAD), complete
cds 


4e-29 


1168287 


ACYL-COA 

DEHYDROGENASE. VERY- 
LONG-CHAIN SPECIFIC 
PRECURSOR (VLCAD) 
dehydrogenase precursor - rat 
Acyl-CoA dehydrogenase
[Rattus norvegicus]


6e-37 


1632 


Y07660 


M.tuberculosis accBC 
gene 


4e-29 


2113935 


(Z95556) accD1
[Mycobacterium tuberculosis] 


3e-47 


1633 


X55367 


Human alpha-satellite 
DNA from clone 
pTRA-2. 


1e-29


<NONE> 


<NONE> 


<NONE> 


1634 


L81866 


Homo sapiens 

(subclone 1 f I from 

PI H54) DNA 
sequence 


1e-29


<NONE> 


<NONE>


<NONE> 


1635 


S75940 


Alu repeats {clone
52H10} [human,
colonic mucosa,
Genomic, 943 nt]


1e-29


728831 


ALU SUBFAMILY J
WARNING ENTRY


1e-07


1636 


AB001907 


Homo sapiens 
PACE4 gene, exon 13


1e-29


728831 


ALU SUBFAMILY J 
WARNING ENTRY


2e-09 


1637 


AF077003


Mus musculus SH3
domain-containing
adapter protein
mRNA, complete cds


5e-30 


<NONE> 


<NONE> 


<NONE>
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












(AC005990) Contains repeated




1638 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds 


4e-30 


4056454 


region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


3e-10 


1639 


M27072 


Xenopus laevis
poly(A)-binding
protein (ABP-EF)
mRNA, complete cds.


4e-30 


1352709 


POLYADENYLATE-
BINDING PROTEIN
polyadenylate-binding protein -
African clawed frog laevis]


5e-21 


1640 


X58386 


B.taurus mRNA for 
bovine vacuolar 
ATPase subunit A 


2e-30 


2773154 


(AF039573) abscisic acid- and 
stress-inducible protein 


4.3 


1641 


Y07660 


M. tuberculosis accBC 
gene 


1e-30


2113935 


(Z95556) accD1
[Mycobacterium tuberculosis]


4e-47 


1642 


AJ236940 


Sus scrofa mRNA for
hypothetical protein 
(5': clone 7C4) 


4e-31 


4102021 


(AF007561) delta 6-desaturase
[Borago officinalis]


7.4 


1643 


AF039400 


Homo sapiens 
calcium-dependent 
chloride channel-1
(hCLCA1) mRNA,
complete cds 


2e-31 


3721912 


(AB017156) gob-5 [Mus 
musculus]


7e-08 


1644 


L77036 


Homo sapiens 
(subclone 5_d9 from 
PI H19) DNA 
sequence. 


1e-31


461663 


BOMBYXIN B-2 HOMOLOG 
PRECURSOR silkmoth 
>gi|217385|gnl|PID|d1003528
(D13924) Samia bombyxin
homolog B-2 [Samia cynthia]


1.1


1645 


X61971 


H.sapiens mRNA for 
macropain subunit 
delta 


1e-31


296734 


(X61971) macropain subunit 
delta [Homo sapiens] 


3e-06 


1646 


L00016 


human mitochondrial 
trnas and partial 
proteins 4 & 5; 
histidyl-, seryl-,
leucyl-trna genes;
urf4 and urf5 
(partial). 


5e-32 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


0.002 


1647 


M17887


Human acidic 
ribosomal 
phosphoprotein P2 
mRNA, complete cds.


5e-32 


4056454 


(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]


1e-05



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION


DESCRIPTION


P VALUE






Human mitogen- 










1659


U53446


responsive 

phosphoprotein DOC- 
2 mRNA, complete
cds. 


6e-34 


3395443 


(AC004683) putative 
ammonium transporter. 3' partial 


4.7 


1660 


AF013988


Homo sapiens serine 
protease mRNA, 
complete cds 


4e-34 


2507226


PROTEIN-TYROSINE
PHOSPHATASE EPSILON
PRECURSOR (R-PTP-
EPSILON) >gi|1439605
(U62387) protein tyrosine
phosphatase-e [Mus musculus]


3.2 


1661 


U53446 


Human mitogen- 
responsive 

phosphoprotein DOC- 
2 mRNA, complete 
cds. 


2e-34 


104757 


LEP100 protein precursor - 
chicken >gi|212254 gallus]


1.6 


1662 


AJ233632 


Homo sapiens 
endogenous retroviral 
sequence ERV-L pol 
gene, clone ERV-L 
Human6 


2e-34 


3860513 


(AJ233597) reverse 
transcriptase [Mus famulus] 


4e-10 


1663 


AF086310 


Homo sapiens full 
length insert cDNA 
clone ZD51F08 


8e-35 


2947070 


(AC002521) putative Ser/Thr 
protein kinase [Arabidopsis 
thaliana]


2.3 


1664 


X17206


Human mRNA for
LLRep3


3e-35 


730652 


40S RIBOSOMAL PROTEIN
S2 (STRINGS OF PEARLS 
PROTEIN) 

>gi|1085158|pir||S50325
ribosomal protein S2 - fruit fly 
(Drosophila melanogaster) 
melanogaster) >gip 139 / - 
(U01335) ribosomal protein S2 


2e-10


1665 


AB011137


Homo sapiens mRNA
for KIAA0565
protein, complete cds 


3e-35 


3043654 


(AB011137) KIAA0565 protein
[Homo sapiens]


2e-16


1666 


U62801 


Human protease M 
mRNA, complete cds


2e-35 


3929231


(AF091247) potassium channel
[Rattus norvegicus]


1.0 


1667 


AF020760 


Homo sapiens serine 
protease (Omi) 
mRNA, complete cds


1e-35


2738915 


(AF020760) serine protease 
[Homo sapiens]


9e-14
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human DNA 










1668 


Z93943 


sequence from
cosmid U235H3 on 
chromosome X 


8e-36 


1196432 


(M22333) unknown protein 
[Homo sapiens] 


3e-10


1669


X06778 


Rabbit 18S rRNA 


7e-36 


118588 


DEHYDRIN DHN3 
>gi|100035|pir||S18139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3
[Pisum sativum]


0.011 


1670 


AB007962 


Homo sapiens 
mRNA. chromosome 
1 specific transcript
KIAA0493 


3e-36 


3329243 


(AE001350) hypothetical 
protein [Chlamydia trachomatis] 


J. 1 


1671 


Z81014 


Human DNA 
sequence from 
cosmid U65A4, 
between markers 
DXS366 and DXS87 
on chromosome X * 


3e-36 


141103 


HYPOTHETICAL PROTEIN 
ORF-1137 mouse 


0.038 




Z81014 


Human DNA
sequence from
cosmid U65A4,
between markers
DXS366 and DXS87
on chromosome X * 


3e-36 


198651 


( IV] _ y j j ) U K_t 1 [ivius 
musculus] 


0.006 


1673 


U49082 


Human transporter 
protein (g17) mRNA,
complete cds 


3e-36 


1840045 


(U49082) transporter protein 
[Homo sapiens] 


2e-15 


1674 


J03133 


Human transcription 
factor SP1 mRNA, 3'
end. 


3e-36 


477133 


HF-l regulatory element binding 
protein - rat 


2e-31 


1675 


AB007934 


for KIAA0465 
protein, partial cds 


le-36 


3413892 


(AB007934) KIAA0465 protein 
[Homo sapiens] 


4e-37 


1676 


M34857 


Mouse Hox-2.5
mRNA. 


9e-37 


106296 


homeotic protein Hox B9 - 
human (fragment) 


0.15 


1677 


L35657 


Homo sapiens 
(subclone H8 5_al0 
from PI 35 H5 C8) 
DNA sequence. 


9e-37 


2072960 


(U93568) p40 [Homo sapiens]


3e-05 


1678 


X80240 


H.sapiens 
endogenous 
retrovirus HERV- 
KC4 DNA 


8e-37


4185944


(Y17833) env protein [Human
endogenous retrovirus K]


1e-15
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human DNA 










1679 


Z93943 


sequence from 
cosmid U235H3 on 
chromosome X 


9e-38 


106322 


hypothetical protein (L1H 3' 
region) - human


4e-13 


1680 


X97303 


H.sapiens mRNA for
Ptg-12 protein 


4e-38 


466044 


HYPOTHETICAL ZINC
FINGER PROTEIN ZK686.4
IN CHROMOSOME III
>gi|630780|pir||S44909 ZK686.4
protein - Caenorhabditis elegans
>gi|304346 (L17337) coded for
by C. elegans cDNAs
GenBank:M88869 and T01933;
putative [Caenorhabditis
elegans]


3e-37 


1681 


Y08999 


H.sapiens mRNA for
Sop2p-like protein 


3e-38 


3334339 


SOP2-LIKE PROTEIN 


5e-Oo 


1682 


Z62887 


H.sapiens CpG DNA,
clone 74g6, forward
read cpg74g6.ft1a.


2e-38 


1245636 


(U53181) F36D4.2 gene 
product [Caenorhabditis 
elegans]


0.19 


1683 


U35032 


Human endogenous 
retrovirus clone 
c5.11, HERV-H
multiply spliced
subgenomic leader,
protease and integrase
region mRNA, partial
cds 


1e-38


59977 


(Z14310) tripartite fusion
transcript PLA2L [Human
endogenous retrovirus]


1e-06


1684 


D86974 


Human mRNA for
KIAA0220 gene,
partial cds


1e-38


3337386 


(AC002544) Unknown gene 
product splice form-2 [Homo 
sapiens]


oC- 1 i 


1685 


M31013 


Human nonmuscle 
myosin heavy chain 
(NMHC) mRNA, 3'
end.


1e-38


4115748 


(AB022023) nonmuscle myosin 
heavy chain B


2e-11


1686 


AF006087


Homo sapiens Arp2/3
protein complex
subunit p20-Arc
(ARC20) mRNA,
complete cds 


4e-39 


<NONE> 


<NONE> 


<NONE> 


1687 


X58374 


D.melanogaster crn 
mRNA 


4e-39 


2655888 


(AL009171) 62D9.a 
[Drosophila melanogaster]


4e-42 


1688


D85815


Human DNA for
rhoHP1, complete cds


1e-39


134080 


GTP-BINDING PROTEIN 
TC10 ras-like protein [Homo 
sapiens]


3e-26 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1689 


U49057 


Rattus norvegicus 
CTD-binding SR-like 
protein rA9 mRNA, 
complete cds 


4e-40 


1438534 


(U49057) rA9 [Rattus
norvegicus] 


5e-05 


1690 


Y08999 


H.sapiens mRNA for 
Sop2p-like protein 


4e-40 


3334339 


SOP2-LIKE PROTEIN 


9e-08 


1691 


AB002293 


Human mRNA for 
KIAA0295 gene, 
partial cds 


4e-40 


2224531 


(AB002293) KIAA0295 [Homo
sapiens] 


1e-30


1692 


AF086222 


Homo sapiens full 
length insert cDNA 
clone ZC66E08 


1e-40


2829669 


DOUBLE-STRANDED RNA-
SPECIFIC EDITASE 1
(DSRNA ADENOSINE
DEAMINASE) (RNA
EDITING ENZYME 1)
>gi|1707502|gnl|PID|e254627
(X99227) double-stranded RNA-
specific editase [Homo sapiens]
editase 1 hRED1-L [Homo
sapiens] >gi|2039300 (U76421) 
dsRNA adenosine deaminase 
DRADA2b [Homo sapiens] 


0.61 


1693 


AF044127 


Homo sapiens 
peroxisomal short- 
chain alcohol 
dehydrogenase 
(SCAD-SRL) mRNA, 
complete cds 


1e-40


4105190


(AF044127) peroxisomal short-
chain alcohol dehydrogenase


2e-06 


1694 


U36778 


Mus musculus Sil 
mRNA, complete cds 


1e-40


88608 


SIL protein - human >gi|338088
(M74558) SIL 


6e-23 


1695 


U36778 


Mus musculus Sil 
mRNA, complete cds


1e-40


88608


SIL protein - human >gi|338088
(M74558) SIL


6e-23 


1696 


U36778


Mus musculus Sil
mRNA, complete cds


1e-40


88608


SIL protein - human >gi|338088
(M74558) SIL


5e-23 


1697 


U36778 


Mus musculus Sil 
mRNA, complete cds


1e-40


88608


SIL protein - human >gi|338088
(M74558) SIL


5e-23 


1698 


AB018285


Homo sapiens mRNA
for KIAA0742
protein, partial cds


1e-40


3882205


(AB018285) KIAA0742 protein
[Homo sapiens]


6e-31
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION 


P VALUE 












ATP-BINDING CASSETTE




1699 


X75927 


M.musculus abc2 
mRNA 


1e-40


728773 


TRANSPORTER 1 ABC1 - 
human >gi|495257 (X75926) 
abc1 [Mus musculus]


3e-37 


1700 


AF038200 


Homo sapiens clone 
23954 mRNA 
sequence 


5e-41 


3211975 


(AF068195) putative 
glialblastoma cell differentiation 
related protein [Homo sapiens] 


5e-14 


1701 


U20521 


Human estrogen 
sulfotransferase 
(STE) gene, exon 8 
and complete cds 


4e-41 


<NONE>


<NONE> 


<NONE> 


1702 


AF026548 


Homo sapiens 
branched chain alpha- 
ketoacid 

dehydrogenase kinase 
precursor, mRNA, 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


2e-41 


3182923 


[3-METHYL-2- 
OXOBUTANOATE 
DEHYDROGENASE 
(LIPOAMIDE)] KINASE 
PRECURSOR alpha-ketoacid 
dehydrogenase kinase precursor 
[Homo sapiens]


2e-09 


1703 


Y07660 


M.tuberculosis accBC
gene


2e-41 


465847 


HYPOTHETICAL 66.5 KD 
PROTEIN F02A9.5 IN 
CHROMOSOME III 
>gi|280542|pir||S28313
hypothetical protein F02A9.5 -
Caenorhabditis elegans
Genefinder: similar to Propionyl
CoA carboxylase beta chain;
cDNA EST EMBL:M89018
comes from this gene; cDNA
EST EMBL:D28069 comes
from this gene; cDNA EST
EMBL:D28068 comes from this
gene; cDNA EST ...


3e-38 


1704 


AC001237


Homo sapiens
genomic DNA, 21q
region, clone:
)H11N46


1e-41


106322


hypothetical protein (L1H 3'
region) - human


5e-09 


1705 


AB007934


Homo sapiens mRNA
for KIAA0465
protein, partial cds


1e-41


3413892


(AB007934) KIAA0465 protein
[Homo sapiens]


3e-12 


1706 


AF055029


Homo sapiens clone
47 1 1 mRNA
sequence


5e-42 


3250681


(AL024486) putative protein


2.2 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












1- 




1707 


Z49747 


O.cuniculus mRNA
for phospholipase C


5e-42


130227 


PHOSPHATIDYLINOSITOL-
4,5-BISPHOSPHATE
PHOSPHODIESTERASE
DELTA 1 (PLC-DELTA-1)
(PHOSPHOLIPASE C-DELTA-
1) (PLC-III) >gi|163538
(M20638) phospholipase C-III
[Bos taurus]


5e-36 


1708 


M93651 


Human set gene, 
complete cds. 


2e-42 


<NONE>


<NONE> 


<NONE> 


1709


AJ236940 


Sus scrofa mRNA for 
hypothetical protein 
(5': clone 7C4) 


2e-42 


2062403 


(U79010) delta 6 desaturase 
[Borago officinalis]


8.5 


1710 


J03634 


Human erythroid 
differentiation protein 
mRNA 


2e-42 


1708436 


INHIBIN BETA A CHAIN 
PRECURSOR 


2e-10 


1711 


AJ223777 


Mus musculus mRNA 
for striatin 


6e-43 


2494917 


STRIATIN 

>gi|1495773|gnl|PID|e254158


2e-32 


1712 


AF016411


Homo sapiens
potassium channel
subunit KCNA3.1B


2e-43


2708i 14 


(AF016411) KCNA3.1B [Homo 
sapiens] 


3e-13 


1713 


AC001443 


Homo sapiens 
(subclone 2_fl0 from 
BAC 2913 


1e-43


111814 


hypothetical protein 3 - rat 
>gi|56589


2e-06 


1714 


X82895


H.sapiens mRNA for 
DLG2 


6e-44 


2497511 


MAGUK P55 SUBFAMILY
MEMBER 2 (MPP2 PROTEIN)
(DISCS LARGE HOMOLOG
2)


6e-52


1715 


U17077


Human BENE
mRNA, partial cds.


3e-44 


53912 


(X57960) ribosomal protein L7 
[Mus musculus] >gi|55489


8e-30 


1716 


AJ222700 


Homo sapiens mRNA 
for TSC-22 protein


2e-44 


<NONE> 


<NONE> 


<NONE> 


1717 


J03634 


Human erythroid 
differentiation protein 
mRNA 


2e-44 


124279 


INHIBIN BETA A CHAIN
PRECURSOR PROTEIN)
(EDF) >gi|87936|pir||B24248
inhibin beta-A chain precursor -
human >gi|181947 (J03634)
erythroid differentiation protein
precursor [Homo sapiens]
sapiens]

>gi|22oS50|prfj|l60S260B 
inhibin betaA [Homo sapiens]


0.73 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1718 


AB014518


Homo sapiens mRNA 
for KIAA0618 
protein, complete cds 


7e-45 


1911548 


(S80864) cytochrome c-like 
polypeptide sapiens] 


1.6 


1719 


X76808 


H.sapiens genomic 
DNA clone d2 


7e-45 


868201 


(U29380) similar to adenylate 
cyclase [Caenorhabditis elegans]


2e-09 


1720


AB021288 


Homo sapiens mRNA 
for beta 2- 
microglobulin, 
complete cds 


2e-45 


2465521 


(U95995) RNA-dependent RNA 
polymerase [Cryptosporidium 
parvum] 


0.15 


1721 


X63468 


H.sapiens mRNA for 
transcription factor 
TFIIE alpha 


8e-46 


<NONE> 


<NONE> 


<NONE> 


1722 


AF019226


Homo sapiens D2-2
mRNA, 3' UTR


7e-46 


<NONE> 


<NONE> 


<NONE> 


1723 


D31764 


Human mRNA for 
KIAA0064 gene, 
complete cds 


2e-46 


3123050 


HYPOTHETICAL PROTEIN 
KIAA0064 


1e-15


1724 


K02774 


Human MHC class II 
HLA-DR-beta-psi 
(DW4/DR4) 
pseudogene, exons
3, 4, 5, 6, clones cosII-
3301 and cosII-801.


1e-46


4185946 


(Y17834) gag protein [Human
endogenous retrovirus K]


2e-14


1725 


X92109 


H.sapiens hcsIX gene


9e-47 


2498185 


BRIDE OF SEVENLESS
PROTEIN PRECURSOR
>gi|1079166|pir||A47550 bride
of sevenless precursor - fruit fly
(Drosophila virilis) >gi|290216
virilis]


1.4 


1726 




H.sapiens 

mitochondrial DNA, 
complete genome 


8e-47 


128753 


NADH- UBIQUINONE 
OXIDOREDUCTASE CHAIN 
4 >gi|S6696|pir||A00435 NADH 
dehydrogenase (ubiquinone) 


4e-15 


1727 


M85145 


Human tumor 
necrosis factor 
receptor, 3' flank. 


3e-47 


<NONE> 


<NONE> 


<NONE> 


1728 


X80240 


H.sapiens 
endogenous 
retrovirus HERV- 
K(C4) DNA 


3e-47 


4185944 


(Y17833) env protein [Human 
endogenous retrovirus K] 


7e-18 


1729 


Z63594 


H.sapiens CpG DNA, 
clone 87f9, forward 
read cpg87f9.ft1a. 


1e-47 


3322743 


(AE001222) T. pallidum 
predicted coding region TP0454 


2.4 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






R.rattus mRNA for 










1730 


X62295 


vascular type- 1 
angiotensin II 
receptor 


4e-48 


1209756 


(U43629) integral membrane 
protein [Beta vulgaris] 


le-u/ 


1731 


M85145 


Human tumor 
necrosis factor 
receptor, 3' flank. 


3e-48 


<NONE> 


<NONE> 


<NONE> 


1732 


AB020712 


Homo sapiens mRNA 
for KIAA0905 
protein, complete cds 


4e-49 


4240299 


(AB020712) KIAA0905 protein 
[Homo sapiens] 


2e-20 


1733 


AB020712 


Homo sapiens mRNA 
for KIAA0905 
protein, complete cds 


3e-49 


4240299 


(AB020712) KIAA0905 protein 
[Homo sapiens] 


2e-':0 


1734 


X62295 


R.rattus mRNA for 
vascular type- 1 
angiotensin II 
receptor 


1e-49 


1209756 


(U43629) integral membrane 
protein [Beta vulgaris] 


7e-12 


1735 


AJ007509 


Homo sapiens mRNA 
for E1B-55kDa- 
associated protein 


1e-49 


3319956 


associated protein 



4e-24 


1736 


X97303 


H.sapiens mRNA for 
Ptg-12 protein 


1e-49 


466044 


HYPOTHETICAL ZINC 
FINGER PROTEIN ZK686.4 
IN CHROMOSOME III 
>gi|630780|pir||S44909 ZK686.4 
protein - Caenorhabditis elegans 
>gi|304346 (L17337) coded for 
by C. elegans cDNAs 
GenBank:M88869 and T01933; 
putative [Caenorhabditis 
elegans] 


| 

j 

8e-3'. i 


1737 


AF038404 


Homo sapiens 
homolog of Nedd5 
(hNedd5) mRNA, 
complete cds 


4e-50 


<NONE> 


<NONE> 


<NONE> 


1738 


L43618 


Homo sapiens 
polycystic kidney 
disease (PKD1) gene, 
exons 35-42 


4e-50 


903758 


(L43619) polycystic kidney 
disease 1 protein [Homo 
sapiens] 


3e-' • 


1739 


AF009424 


Homo sapiens clone 
22 mRNA, alternative 
splice variant alpha-1, 
complete cds 


4e-50 


2271473 


(AF009426) clone 22 [Homo 
sapiens] 


5e 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












monosaccharide transport protein 




1740 


L77040 


Homo sapiens 
(subclone 8_c11 from 
P1 H22) DNA 
sequence. 


t 

2e-50 


99758 


STK4 - Arabidopsis thaliana 
>gi|16524 (X66857) sugar 
transport protein [Arabidopsis 
thaliana] 


6.4 


1741 


L35657 


Homo sapiens 
(subclone H8 5_a10 
from P1 35H5C8) 
DNA sequence. 


2e-50 


2072960 


(U93568) p40 [Homo sapiens] 


2e-05 


1742 


U80745 


Homo sapiens CTG7a 
mRNA, partial cds 


1e-50 


<NONE> 


<NONE> 


<NONE> 


1743 


D84514 


Bovine mRNA for 
p97. partial cds 


1e-50 


3978527 


(AF103728) structural 
polyprotein [Sindbis virus] 


9.9 


1744 


M22960 


Human protective 
protein mRNA. 
complete cds. 


1e-50 


131081 


LYSOSOMAL PROTECTIVE 
PROTEIN PRECURSOR 
(CATHEPSIN A) 
(CARBOXYPEPTIDASE C) 
human >gi|190283 (M22960) 
protective protein precursor 


1e-12 


1745 


X86018 


H.sapiens mRNA for 
MUF1 protein 


1e-50 


1082610 


muf1 protein - human 
>gi|762953 (X86018) muf1 
[Homo sapiens] 


1e-21 


1746 


U03495 


Human transcription 
factor LSF-ID 
mRNA. complete cds. 


7e-51 


2136296 


transcription factor LSF - human 
>gi|476099 


1e-21 


1747 


AB015344 


Homo sapiens 
HRIHFB2157 
mRNA, partial cds 


5e-51 


3970874 


(AB015344) HRIHFB2157 
[Homo sapiens] 


2e-35 


1748 


iviy j j _>v 


Human zinc finger 
protein mRNA. 




JU24 1 10 


MYC-ASSOCIATED ZINC 
FINGER PROTEIN sapiens] 


2e-06 


1749 


U71363 


Human zinc finger 
protein zfp6 (2F6) 
mRNA. partial cds 


4e-51 


2689441 


(AC003682) F18547_1 [Homo 
sapiens] 


2e-11 


1750 


X56932 


H.sapiens mRNA for 
23 kD highly basic 
protein 


4e-51 


730451 


60S RIBOSOMAL PROTEIN 
L13A (23 KD HIGHLY BASIC 
PROTEIN) 
>gi|345897|pir||S29539 basic 
protein, 23K - human >gi|23691 
(X56932) 23 kD highly basic 
protein [Homo sapiens] 


le-U 


1751 


Z79054 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment, 
SC6pA21Ell 


2e-51 


<NONE> 


<NONE> 


<NONE> 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










1752 


AF068245 


BAF6O0 gene, partial 
sequence 


5e-52 


<NONE> 


<NONE> 


<NONE> 


1753 


AJ236932 


Sus scrofa mRNA for 
hypothetical protein 
(5'; clone 4B8) 


5e-52 


400927 


RIBONUCLEOPROTEIN 
RB97D ribonucleoprotein 
[Drosophila melanogaster] 


4.7 


1754 


AF003693 


Mus musculus 
scaffold protein Pbpl 
homolog mRNA, 
complete cds 


6e-53 


2197106 


(AF003693) scaffold protein 
Pbp1 homolog [Mus musculus] 


2e-54 


1755 


M27319 


Human calmodulin 
mRNA. complete cds. 


5e-53 


115528 


CALMODULIN 
>gi|102408|pir||JC1309 
calmodulin - Stylonychia lemnae 
(SGC5) >gi|161195 


0.002 


1756 


M74555 


Mouse house-keeping 
protein mRNA, 
complete cds. 


5e-53 


284775 


house-keeping protein - mouse 
>gi|193871 


5e-30 


1757 


X92720 


H.sapiens mRNA for 
phosphoenolpyruvate 
carboxykinase 


6e-54 


2135915 


phosphoenolpyruvate 
carboxykinase (GTP) (EC 
4.1.1.32) precursor, 
mitochondrial - human 
carboxykinase (GTP) [Homo 
sapiens] 


6e-21 


1758 


AF007872 


Homo sapiens torsinB 
(DQ1) mRNA, partial 
cds 


2e-54 


2760121 


(AB002405) LAK-4p [Homo 
sapiens] 


0.27 


1759 




Mus musculus 
B6CBA Lisch7 
mRNA. partial cds. 


2e-54 


1236083 


(U49507) Lisch7 [Mus 
musculus] 


3e-27 


1760 


Z73360 


Human DNA 
sequence from 
cosmid 92M18. 
BRCA2 gene region 
chromosome 13q12- 
13. 


1e-55 


2370371 


(Y14657) hydrophobin 
[Pleurotus ostreatus] 

^gl|_70— U_\J|gni|r LU\C L.OJ70D 

(AJ225061) POH2 hydrophobin 
[Pleurotus ostreatus] 


2.0 


1761 


U83702 


Human cytochrome c 
oxidase subunit VIa 
gene, exon 3 and 
complete cds 


8e-56 


2982994 


(AE000682) hypothetical 
protein [Aquifex aeolicus] 


7.0 


1762 


Y12781 


Homo sapiens mRNA 
for transducin (beta) 
like 1 protein 


7e-56 


3021409 


(Y12781) transducin (beta) like 
protein [Homo sapiens] 


7e-?9 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1763 


AB020673 


Homo sapiens mRNA 
for KIAA0866 
protein, complete cds 


8e-57 


2104553 


(AF001548) Myosin heavy 
chain (MHY11) (5' partial) 
[Homo sapiens] 


4e-04 


1764 


AJ236932 


hypothetical protein 
(5'; clone 4B8) 


3e-57 


400927 


RIBONUCLEOPROTEIN 
RB97D ribonucleoprotein 
[Drosophila melanogaster] 


4.7 


1765 


L06900 


Human dystrophin 

gene, intron 1 
containing pseudo 
exon. 


le-58 


4185129 


(AC005724) unknown protein 
[Arabidopsis thaliana] thaliana] 


7.0 


1766 


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome 


9e-59 


1492050 


(U60315) MC107L [Molluscum 
contagiosum virus subtype 1] 


0.17 


1767 


AF064856 


Rattus sp. 7acomp 
protein mRNA, 
complete cds 


3e-59 


3169626 


(AF064856) 7acomp protein 
[Rattus sp.] 


2e-3 1 


1768 


AF081484 


Homo sapiens alpha- 
tubulin isoform 1 
mRNA. complete cds 


2e-59 


32015 


(X06956) alpha-tubulin [Homo 
sapiens] 


4e-' , '> 1 


1769 


X71427 


Homo sapiens mRNA 
for FUS-CHOP 
protein fusion 


le-60 


746557 


(U23523) histidine-rich 
[Caenorhabditis elegans] 


0.45 


1770 


AF013988 


Homo sapiens serine 
protease mRNA. 
complete cds 


le-60 


2564316 


(AB006622) No similarities to 
any reported proteins [Homo 
sapiens] 


026 


1771 


yj - Joy i 


Mus musculus 
lymphocyte specific 
helicase mRNA, 
complete cds 


7e-61 


2137490 


lymphocyte specific helicase - 
mouse musculus] 


3e-">5 I 


1772 


X93334 


H.sapiens 

mitochondrial DNA, 
complete genome 


4e-61 


70656 


ubiquitin / ribosomal protein 
S27a - human extension protein, 

•fUBCFPSO rhumin Pentirlf 
56 aa] ubiquitin extention 
protein [Cavia porcellus] 


9e-08 


1773 


D38255 


Homo sapiens mRNA 
for CAB1, complete 
cds 


4e-61 


2135214 


gene MLN 64 protein - human 


4e-23 


1774 


U25691 


Mus musculus 
lymphocyte specific 
helicase mRNA, 
complete cds 


8e-62 


2137490 


lymphocyte specific helicase - 
mouse musculus] 


Se-26 1 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1775 


M21731 


Human lipocortin-V 
mRNA. complete cds. 


6e-62 


3212603 


Human Annexin V With Proline 
Substitution By Thioproline 


2e-20 


1776 


AF021936 


Rattus norvegicus 
myotonic dystrophy 
kinase-related Cdc42- 
binding kinase 
MRCK-beta (MRCK- 
beta) mRNA, 
complete cds 


2e-62 


2736153 


(AF021936) myotonic 
dystrophy kinase-related Cdc42- 
binding kinase MRCK-beta 
[Rattus norvegicus] 


3e-27 


1777 


Y12059 


H.sapiens HUNK1 
mRNA 


1e-62 


3184498 


(AC004798) R31546_1 [Homo 
sapiens] 


3e-09 


1778 


L37368 


Human (clone E5-1) 
RNA-binding protein 
mRNA. complete cds. 


6e-63 


477578 


sialidase - Actinomyces viscosus 
>gi|141852 


7.8 


1779 


M27877 


Figure 1. Nucleotide 
and translated protein 
sequences of HPF1, - 
2. and -9. 


5e-63 


1731443 


ZINC FINGER PROTEIN 83 
(ZINC FINGER PROTEIN 
HPF1) >gi|106023|pir||A32891 
finger protein I, placental - 
human 


3e-33 


1780 


AF095448 


Homo sapiens 
putative G protein- 
coupled receptor 


2e-63 


3116131 


(AL02328S) hypothetical 
protein 


4.6 


1781 


L19437 


Human transaldolase 
mRNA containing 
transposable element, 
complete cds 


2e-63 


1553119 


(Uojl39) transaldolase [Mus 
musculus] 


4e-18 


1782 


L41351 


Homo sapiens 
prostasin mRNA, 
complete cds 


1e-63 


2833277 


PROSTASIN PRECURSOR 
precursor - human >gi|862305 
(L41351) prostasin [Homo 
sapiens] >gi|1143194 (U33446) 
prostasin [Homo sapiens] 


6e-14 


1783 


AF053470 


Homo sapiens 10kD 
protein (BC10) 
mRNA. complete cds 


6e-64 


482237 


hypothetical protein K03H1.9 - 
Caenorhabditis elegans 


0.029 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1784 


D37791 


Mouse mRNA for 
beta-1,4- 
galactosyltransferase 


6e-64 


3880102 


■ U9j390; similar m b i ve zinc 

finger; cDNA EST yk265b4.5 
comes from this gene; cDNA 
EST yk359g9.5 comes from this 
gene; cDNA EST yk319c2.5 
comes from this gene 
[Caenorhabditis elegans] zinc 
finger; cDNA EST yk265b4.5 
comes from this gene; cDNA 
EST yk359g9.5 comes from this 
gene; cDNA EST yk319c2.5 
comes from this gene 
[Caenorhabditis elegans] 


3e-16 


1785 


AF015770 


Mus musculus radical 
fringe (radical- fringe) 
mRNA. complete cds 


6e-64 


2204355 


(U94350) radical fringe 
precursor [Mus musculus] 


1e-36 


1786 


Z79054 


H.sapiens flow-sorted 
chromosome 6 
HindIII fragment, 
SC6pA21EU 


2e-64 


<NONE> 


<NONE> 


<NONE> 


1787 


M83094 


Homo sapiens 
cytosolic selenium- 
dependent glutathione 
peroxidase gene, 
complete cds, and 
rhoh12 gene, 3' end. 


1e-64 


2447063 


(U42580) A565R [Paramecium 
bursaria Chlorella virus 1] 


8.8 


1788 


Y10211 


H.sapiens LAG-3 
gene, promoter region 


7e-65 


1944540 


(X14112) tegument protein 
[human herpesvirus 1] 


2.3 


1789 


M19045 


Human lysozyme 
mRNA. complete cds. 


2e-65 


<NONE> 


<NONE> 


<NONE> 


1790 


U01882 


Homo sapiens SS- 
A/Ro autoantigen 52 
kDa component gene, 
complete cds 


2e-65 


585401 


LIPASE MODULATOR 
PRECURSOR (LIPASE 
HELPER PROTEIN) 
>gi|480045|pir||S36249 lipB 
protein - Pseudomonas glumae 
>gi|49207 (X70354) helper 
protein 


4.2 


1791 


AF069517 


Homo sapiens RNA 
binding protein DEF- 
3 mRNA, complete 
cds 


2e-65 


3212101 


(AF069517) RNA binding 
protein DEF-3 [Homo sapiens] 


1e-25 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens jerky 










1792 


AF004715 


gene product 
homolog mRNA, 
complete cds 


2e-65 




(AF004715) jerky gene product 
homolog [Homo sapiens] 


2e-45 


1793 


X59652 


C.longicaudatus hprt 
mRNA for 
hypoxanthine 


3e-66 


631625 


hypoxanthine (guanine) 
phosphoribosyltransferase - long 
tailed hamster 
phosphoribosyltransferase 
[Cricetulus longicaudatus] 


6e-54 


1794 


U94350 


Mus musculus radical 
fringe precursor 
mRNA. complete cds 


3e-67 


2204355 


(U94350) radical fringe 
precursor [Mus musculus] 


2e-33 


1795 


AF015811 


Mus musculus 
putative 

lysophosphatidic acid 
acyltransferase 
mRNA. complete cds 


3e-68 


2317725 


(AF015811) putative 
lysophosphatidic acid 
acyltransferase [Mus musculus] 


7e-51 


1796 


J03137 


Cow 

phosphoinositide- 
specific 

phospholipase C 


3e-69 


226908 


phospholipase C 154 [Bos 
taurus] 


3e-25 


1797 


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKL) mRNA, 
complete cds 


1e-69 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


2e-33 


1798 


AF015811 


Mus musculus 
putative 

lysophosphatidic acid 
acyltransferase 
mRNA. complete cds 


4e-70 


2317725 


(AF015811) putative 
lysophosphatidic acid 
acyltransferase [Mus musculus] 


3e-19 


1799 


X65157 


M.musculus mRNA 
for desmoyokin, 
partial 


5e-74 


109781 


desmoyokin - mouse (fragment) 
>gi|50675 


9e-37 


1800 


Z97207 


Mus musculus mRNA 
for B-IND1 protein 


2e-74 


2231019 


(Z97207) B-IND1 protein [Mus 
musculus] 


6e-21 


1801 


U27196 


Gallus gallus zinc 
finger protein (Fzf-1) 
mRNA, complete cds. 


6e-75 


984814 


(U27196) zinc finger protein 
[Gallus gallus] gallus] 


2e-44 
Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












70 KD WD- REPEAT TUMOR 




1802 


Y15054 


Rattus norvegicus 
mRNA for 70 kDa 
tumor specific 
antigen, partial 


3e-77 


3123027 


SPECIFIC ANTIGEN 
>gi|2505957|gnl|PID|e353992 
(Y15054) 70 kD tumor-specific 
antigen [Rattus norvegicus] 


4e-42 


1803 


X65157 


M.musculus mRNA 
for desmoyokin. 
partial 


3e-79 


109781 


desmoyokin - mouse (fragment) 
>gi|50675 


9e-33 


1804 


U50736 


Rattus norvegicus 
cardiac adriamycin 
responsive protein 
mRNA. complete cds 


2e-84 


1362781 


cytokine inducible nuclear 
protein C193 - human 
>gi|793841 (X83703) nuclear 
protein [Homo sapiens] 


7e-30 


1805 


AF072865 


Rattus norvegicus 
thioredoxin reductase 
(TrxR2) mRNA. 
nuclear gene 
encoding 
mitochondrial 
protein, complete cds 


2e-84 


3757888 


(AF072865) thioredoxin 
reductase [Rattus norvegicus] 


6e-43 


1806 


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKL) mRNA. 
complete cds 


6e-85 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


1e-41 


1807 


U19181 


Rattus norvegicus 
Rabin3 mRNA, 
complete cds. 


2e-87 


624225 


(U19181) Rabin3 [Rattus 
norvegicus] 


2e-41 


1808 


U40342 


Mus musculus ninein 
mRNA. complete cds. 


1e-91 


1113865 


(U40342) ninein [Mus 
musculus] 


2e-36 


1809 


X67877 


R. norvegicus mRNA 
for cytosolic 
resiniferatoxin- 
binding protein 


4e-92 


136077 


TROPOMYOSIN BETA 3, 
FIBROBLAST chicken 
>gi|515694 (M23082) 
tropomyosin [Gallus gallus] 


0.56 


1810 


AF044574 


Rattus norvegicus 
putative peroxisomal 
2,4-dienoyl-CoA 
reductase (DCR- 
AKL) mRNA, 
complete cds 


5e-93 


4105269 


(AF044574) putative 
peroxisomal 2,4-dienoyl-CoA 
reductase [Rattus norvegicus] 


1e-50 


1811 


AF035527 


Mus musculus EHF 
(Ehf) mRNA, 
complete cds 


2e-95 


3138930 


(AF035527) EHF [Mus 
musculus] 


2e-47 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 

















1812 


AB016930 


Cricetulus griseus 
mRNA for 
Phosphatidylglycerophosphate synthase, 
complete cds 


6e-96 


4159682 


(AB016930) 

Phosphatidylglycerophosphate 
synthase [Cricetulus griseus] 


7e-41 


1813 




Rattus norvegicus 
mRNA for atypical 
PKC specific binding 
protein, complete cds 


7e-97 


3868778 


(AB005549) atypical PKC 
specific binding protein [Rattus 
norvegicus] 


3e-41 


1814 




G.gallus PB1 gene 


2e-97 


2134381 


polybromo 1 protein - chicken 
chicken >gi|951231 (X90849) 
polybromo 1 protein [Gallus 
gallus] 


1e-34 


1815 




h-lamp-2=lysosome- 
associated membrane 
protein-2 protein-2b 
(LAMP2) mRNA, 
alternatively spliced 
form h-lamp-2b, 
complete cds. 


3e-98 


<NONE> 


<NONE> 


<NONE> 


1816 


U67203 


Mus musculus ACF7 
neural isoform 1 
(mACF7) mRNA, 
partial cds 


2e-98 


1675224 


(U67204) ACF7 neural isoform 
2 [Mus musculus] 


9e-39 


1817 


L14684 


Rattus norvegicus 
nuclear-encoded 
mitochondrial 
elongation factor G 
mRNA, complete cds. 


e-100 


585084 


ELONGATION FACTOR G, 
MITOCHONDRIAL 
PRECURSOR (MEF-G) 
>gi|543383|pir||S40780 
translation elongation factor G, 
mitochondrial - rat >gi|310102 


2e-30 


1818 


X84692 


M.musculus Spnr 
mRNA for RNA 
binding protein 


e-133 


1363238 


spermatid perinuclear RNA- 
binding protein Spnr - mouse 
>gi|673454 (X84692) spermatid 
perinuclear RNA binding 
protein [Mus musculus] 


5e-35 


1819 


U50736 


Rattus norvegicus 
cardiac adriamycin 
responsive protein 
mRNA, complete cds 


e-113 


1362781 


cytokine inducible nuclear 
protein C193 - human 
>gi|793841 (X83703) nuclear 
protein [Homo sapiens] 


2e-36 


1820 


S66855 


HoxB9=Hox-2.5 
[mice, embryos, 
mRNA Partial, 786 
nt] 


e-107 


1708355 


HOMEOBOX PROTEIN HOX- 
B9 (HOX-2.5) 


Se-37 



WO 01/02568 



PCT/US00/18374 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



HoxB9=Hox-2.5 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



[mice, embryos, 
mRNA Partial, 786 
1821 S66855 nt] 



Rattus norvegicus m- 
tomosyn mRNA, 
1822 U92072 complete cds 



e-108 



1708355 



DESCRIPTION 



ns) 



P VALUE 



HOMEOBOX PROTEIN HOX- 
B9 (H OX-2.5> I ^.37 



e-102 



2e-38 



Mouse mRNA for 
kinesin-like protein 
1823 D17577 (Kif1b), complete cds 



Mus musculus SDP8 
1824 AF062484 mRNA, complete cds 



e-129 



(U92072) m-tomosyn [Rattus 
3790389 norvegicus] 

kinesin-like protein 

KIF1B mouse 

>gi|407339|gnl|PID|d1005029 
(D17577) Kif1b [Mus 
2497524 musculus] 2e-39 



(AF062484) SDP8 [Mus 
e-122 3126981 musculus] 



1825 X73683 



R.norvegicus mRNA 
for histone H3.3 



Mus musculus ACF7 
neural isoform 1 
(mACF7) mRNA, 



e-109 



122075 



J(H3.3Q) histone H3.3 - fruit fly 
KDrosophila melanogaster) 
histone H3.3B - chicken 
>gi|2 1 19023ipirj|S6I2lS histone | 
|H3.3 - fruit fly (Drosophila 
hydei) 1-1361 [Oryctolagus 
IcuniculusJ >gijS046 (X53S22) 
iHistone H3.JQ gene product 
([Drosophila melanogaster] 
>gi|5U98 gallus] >gi(161I90 
(Ml 7376.) histone H3 [Spisula 
Isolidissimaj >gi|21IS53 
I(M1 1393) histone 3.3 [Gallus 
gallus] >gi|306S4S (Ml 1354) 
IH3.3 histone [Homo sapiens] 
ImeJanogaster] >gi|96303 1 
KXS1205) histone H3.3 H3.3A 
Ivariant [Drosophila 
Imelanogaster] musculus! 



5e-40 



2e-40 



1826 U67203 partial cds 



1827 



Mouse mRNA for 
kinesin-like protein 
D17577 (Kif1b), complete cds 



e-102 



1675224 



e-131 



2497524 



(U67204) ACF7 neural isoform 
2 [Mus musculus] 



KINESIN-LIKE PROTEIN 
KIF1B mouse 

>gi|407339|gnl|PID|d1005029 
(D17577) Kif1b [Mus 
musculus] 



2e-40 



7e-4; 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



1828 AB016930 



Cricetulus griseus 
mRNA for 

Phosphatidylglycerophosphate synthase, 
complete cds e-131 



4159682 



(AB016930) 

Phosphatidylglycerophosphate 
synthase [Cricetulus griseus] 



1829 U09874 



Mus musculus SKD3 
mRNA, complete cds. 



e-122 



2493735 



SKD3 PROTEIN SKD3 [Mus 
musculus] 



1830 X99145 



C.familiaris mRNA 
for C3VS protein 



e-UO 



1429314 



(X99145) overexpressed in 
thyroid tissue after TSH 
stimulation [Canis familiaris] 



7e-48 



1831 X99836 



P.waltl mRNA for 
RNP associated protein 
55 



e-106 



4200286 



(X99836) rap55 [Pleurodeles 
waltl] 



1832 AF077003 



Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds 



e-121 



3550240 



(AF077003) SH3 domain- 
containing adapter protein; 
CD2AP 



2e-50 



3e-51 



1833 AF060246 



Mus musculus strain 
C57BL/6 zinc finger 
protein 106 (Zfp106) 
mRNA, H3a-a allele, 
complete cds 



e-118 



3372657 



(AF060246) zinc finger protein 
106 [Mus musculus] 



1834 Z14030 



1835 AF077003 



1836 L20427 



R.norvegicus mRNA 
for TRAP-complex 



e-120 



1174453 



Mus musculus SH3 
domain-containing 
adapter protein 
mRNA, complete cds 



e-132 



3550240 



Rattus norvegicus 
dihydroxypolyprenylbenzoate 
methyltransferase 
mRNA, complete cds e-116 



457372 



TRANSLOCON- 
ASSOCIATED PROTEIN, 
GAMMA SUBUNIT (TRAP- 
GAMMA) (SIGNAL 
SEQUENCE RECEPTOR 
GAMMA SUBUNIT) (SSR- 
GAMMA) 

>gi|423185|pir||S33294 
translocon-associated protein 
gamma chain - rat norvegicus] 



(AF077003) SH3 domain- 
containing adapter protein; 
CD2AP 



(L20427) 

dihydroxypolyprenylbenzoate 
methyltransferase 
dihydroxypolyprenylbenzoate 
methyltransferase [Rattus 
norvegicus] 



1e-52 



7e-54 



5e-54 



4e-56 



Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION 



DESCRIPTION 



1837 X80169 



1838 AF080568 



1839 X99145 



1840 AF019075 



M.musculus mRNA 
for 200 kD protein 



Rattus norvegicus 

CTP:phosphoethanolamine 

cytidylyltransferase 
mRNA. complete cds 



P VALUE 



ACCESSION 



DESCRIPTION 



e-122 



1717793 



e-119 



3396102 



C.familiaris mRNA 
for C3VS protein 



e-121 



1429314 



Pan troglodytes breast 
and ovarian cancer 
susceptibility 
(BRCA1) gene, 
partial cds e-145 



PROTEIN TSG24 (MEIOTIC 
CHECK POINT 
REGULATOR) 
>gi|1083553|pir||A55117 tsg24 



(AF080568) 

CTP:phosphoethanolamine 
cytidylyltransferase 



(X99145) overexpressed in 
thyroid tissue after TSH 
stimulation [Canis familiaris] 



2e-56 



2218154 



(AF005068) breast and ovarian 
cancer susceptibility protein 
splice variant [Homo sapiens] 



2e-53 



1841 U55042 



Bos taurus myosin X, 
complete cds 



(U55042) myosin X [Bos 
e-122 1755049 taurus] 



1842 AJ007780 



Mus musculus mRNA 
for poly(ADP-ribose) 
polymerase-2 e-119 



(AF072521) poly-(ADPribosyl) 
3283975 transferase homolog PARP 



4e-62 



1843 AF072865 



1844 U55042 



Rattus norvegicus 
thioredoxin reductase 
(TrxR2) mRNA, 
nuclear gene 
encoding 
mitochondrial 

protein, complete cds e-105 



Bos taurus myosin X, 
complete cds 



(AF072865) thioredoxin 
3757888 reductase [Rattus norvegicus] 



e-121 



(U55042) myosin X [Bos 
1755049 taurus] 



3e-62 



1845 X61506 



Mouse E46 mRNA 
for E46 protein 



e-139 



114909 



BRAIN PROTEIN E46 



1846 D90335 



Bovine mRNA for 
GTP-binding protein 
alpha-subunit 



1847 U49507 



Mus musculus 
B6CBA Lisch7 
mRNA. partial cds. 



e-14S 



585 174 



GUANINE NUCLEOTIDE- 
BINDING PROTEIN, ALPHA- 
14 SUBUNIT (GL1) 
>gi|108711|pir||A40891 GTP- 
binding protein GL1 alpha chain 

bovine protein, alpha-subunit 
[Bos taurus] 



2e-69 



e-140 



2121326 



(AC002128) Lisch7 [Homo 
sapiens] 



2e-74 
Table 4 



Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 
1 
2 
3 
4 


ACCESSION 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


DESCRIPTION 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


P VALUE 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


ACCESSION 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


DESCRIPTION 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


P VALUE 
<NONE> 
<NONE> 
<NONE> 

5 
6 
7 
8 
9 
10 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 
<NONE> 
<NONE> 


11 
12 
13 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 


14 
15 
16 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 
<NONE> 


<NONE> 
<NONE> 


17 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


<NONE> 


18 
19 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 


<NONE> 




20 
21 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


22 


<NONE> 


<NONE> 


<NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


<NONE> 




<NONE> 


<NONE> 


<NONE> 


1079469 


tMDC 1 protein - crab-eating 
macaque 


9.3 


24 


<NONE> 


<NONE> 


<NONE> 


3043656 


(AB011138) KIAA0566 protein 
[Homo sapiens] 


9.3 


25 


<NONE> 


<NONE> 


<NONE> 


112175 


potassium channel protein RK5 - 
rat protein [Rattus norvegicus] 




26 


<NONE> 


<NONE> 


<NONE> 


3769624 


(AF091565) olfactory receptor 
[Rattus norvegicus] 


7.2 


27 


<NONE> 


<NONE> 


<NONE> 


3876443 


(Z81517) F28B1.6 
[Caenorhabditis elegans] 


7.1 


28 


<NONE> 


<NONE> 


<NONE> 


2224464 


(AB001684) ORF249 [Chlorella 
vulgaris] 


6.9 




<NONE> 


<NONE> 


<NONE> 


1519707 


(U67940) ORFveglOo; random 
cDNA sequence [Dictyostelium 
discoideum] 


6.7 


30 


<NONE> 


<NONE> 


<NONE> 


227491 


protein kinase C 11 [Xenopus 
laevis] 


6.7 


31 


<NONE> 


<NONE> 


<NONE> 


630575 


C50C3.4 protein - 
Caenorhabditis elegans 


6.0 


32 


<NONE> 


<NONE> 


<NONE> 


137290 


35 KD PROTEIN fts' RN'a2 
clover necrotic mosaic virus 
>gi|61466 (X08021) ORF for 35 
kDa polypeptide (AA 1-317) 
[Red clover necrotic mosaic 
virus] 


6.0 






ACCESSION 



33 



34 <NONE> 



35 



36 <NONE> 



37 <NONE> 



<NONE> 
<NONE> 



<NONE> 



<NONE> 



DESCRIPTION 



30O4I 



<NONE> 



2493585 



<NONE> 



40 <NONE> 



41 <NONE> 



42 <NONE> 



43 <NONE> 



44 <NONE> 



<NONE> 
<NONE> 



<NONE> 



<NONE> 
<NONE> 



NITROGEN REGULATORY 
1182918 PROTEIN AREA 

MITOCHONDRIAL 

ribosomal protein s5 

Emericella nidulans 
mitochondrion (SGC3) 
>gi|12709 nidulans] >gi|472822 
(J01390) unknown protein 
KALOMm) predicted using 
Genefinder; similar to WD 
domain, G-beta repeat; cDNA 
EST yk362f7.5 comes from this 
gene; cDNA EST yk362f7.3 
comes from this gene 
[Caenorhabditis elegans] 



3979943 
950203 



<NONE> 



<NONE> 



<NONE> 



45 <NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



3560232 



(AL031530) hypothetical zinc 
finger protein 

[Schizosaccharomyces pombe] 



73007.1 



I B l\y 

AXONEME-ASSOCIATED 
PROTEIN MST101(1) product 
[Drosophila hydei] 
iHVPOl'HE-ricJALil.y KD 
PROTEIN IN INTE-PIN 

INTERGENIC REGION 
>gi|1787402 (AE000214) orf, 

hypothetical protein 

[Escherichia coli] 



3511232 



1 150900 



3876099 



(AF071556) anthranilate 
dioxygenase large subunit 



(U43139) envelope glycoprotein 
gpl20 [Human 

immunodeficiency virus type 1] 
(273DJ0) similar to dynein 
heavy chain; cDNA EST 
EMBL:D27549 comes from this 
gene; cDNA EST 
EMBL:D34859 comes from this 
gene [Caenorhabditis elegans] 



5.7 



4.3 



4.0 



3.3 



3.0 



2.6 



2.5 



2.4 



1.9 



1.4 






Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


47 <NONE> 


<NONE> 
<NONE> 


<NONE> 
<NONE> 


3881150 
132200 


(AL032647) predicted using 
Genefinder 

COLANIC ACID CAPSULAR 
BIOSYNTHESIS 
ACTIVATION PROTEIN A 
>gi|95605|pir||S17701 rcsA 
protein 


I~4 

c 


48 <NONE> 


<NONE> 


<NONE> 


2204286 


(U01 j80) germination protein 
[Bacillus megaterium] 


1.1 

1.0 


49 <NONE> 


<NONE> 


<NONE> 1723955 


HYPOTHETICAL 11.4 KD 
PROTEIN IN FOXI-KEX1 
INTERGENIC REGION 
>gi|2132566|pir||S64222 
probable membrane protein 
YGL204c - yeast 
(Saccharomyces cerevisiae) 
>gi|1322838|gnl|PID|e243803 
(Z72726) ORF YGL204c 
[Saccharomyces cerevisiae] 




50 <NONE> 


<NONE> 


<NONE> 3201564 


(AJ006314) prolipoprotein 
diacylglyceryl transferase 
[Vibrio cholerae] 


0.84 
0.31 


51 <NONE> 


<NONE> 


<NONE> I 5R0R7-M 


(AL021428) hypothetical 
protein Rv0064 


0.27 


52 <NONE> 


<NONE> 


<NONE> 


602434 


(U17986) GABA/noradrenaline 


0.13 


53 <NONE> 


<NONE> 


<NONE> 
<NONE> 


3347955 


transporter [Homo sapiens] 
(AF076184) cytosolic sorting 
protein PACS-1b [Rattus 
norvegicus] 


0.12 


54 <NONE> 


<NONE> 


12558S7 


ujjj44J coded for by C. 
elegans cDNA yk92b4.5; coded 
for by C. elegans cDNA 
yk73a1.5; coded for by C. 
elegans cDNA yk102e9.5; 
coded for by C. elegans cDNA 
yk71c8.5; coded for by C. 
elegans cDNA yk66d11.5; 
coded for by C. elegans cDNA 
yk66c3... 


0.074 


55 <NONE> 


<NONE> 


<NONE> 


103076 


Bkm-like sex-determining 
region hypothetical protein 
CS314 - fruit fly (Drosophila 
melanogaster) 


0.003 


56 <NONE> 


<NONE> 


<NONE> 


107560 


Ras inhibitor (clone JC265) - 
human sapiens] 


0.002 











Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












Bkm-like sex-determining 




57 


<NONE> 


<NONE> 


<NONE> 


103076 


region hypothetical protein 
CS314 - fruit fly (Drosophila
melanogaster) 


2e-04 


58 


<NONE> 


<NONE> 


<NONE>


2702370 


(AF038604) contains similarity
to Drosophila ovarian tumor 
locus protein (GB:X13693) 
[Caenorhabditis elegans] 


6e-05 


59 


<NONE> 


<NONE> 


<NONE> 


3859713 


(AL033501) phox domain 
protein [Candida albicans] 


3e-05 


60 


<NONE> 


<NONE> 


<NONE> 


2088839 


(AF003386) F59E12.5 gene 
product [Caenorhabditis 
elegans] 


2e-08 


61 


<NONE> 


<NONE>


<NONE> 


121059 


GC-RICH SEQUENCE DNA-BINDING
FACTOR GCF
human >gi|179412 (M29204)
DNA-binding factor [Homo
sapiens]




62 


<NONE> 


<NONE> 


<NONE> 


3875246 


(/S14yU) similar to WD 
domain, G-beta repeats (2 
domains); cDNA EST 
EMBL:T00482 comes from this 
gene. cDNA EST 
EMBL:T00923 comes from this
gene; cDNA EST yk449d4.3 
comes from this gene; cDNA 
EST yk449d4.5 comes from this
gen... 


9e-24 


63 


<NONE> 


<NONE> 


<NONE> 


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans]


9e-28 


64 


<NONE> 


<NONE> 


<NONE> 


3327136 


(AB014561) KIAA0661 protein 
[Homo sapiens] 


1e-29


65 


<NONE> 


<NONE> 


<NONE> 


3880433 


(Z66521) similar to 
mitochondrial RNA splicing 
MSR4 like protein; cDNA EST 
EMBL:C09217 comes from this 
gene [Caenorhabditis elegans] 


8e-31 


66 


D42133 


Rat annexin V gene, 
exon7 and exon8


5.0 


<NONE> 


<NONE> 


<NONE> 


67 


L35679 


Homo sapiens
(subclone H8 2_dl 1
from PI 35H5C8)
DNA sequence.


5.0 


1086902 


(U4I2/y) coded for by C.
elegans cDNA yk79g8.5; coded
for by C. elegans cDNA
cm lOcS; coded for by C. elegans
cDNA yk79g8.3; similar to
leucine-rich repeats found in
many proteins [Caenorhabditis
elegans]


6.6 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


68


U90184


HIV-1 strain BX220
from USA, envelope
glycoprotein C2V3
region (env) gene,
partial cds


5.0 


1297070 


(Z71986) convicilin precursor 
[Vicia narbonensis]


6.6 


69 


U61465 


Human myosin VIIa
(MYO7A) gene, 5 P
exon 37


5.0 


2313225 


(AE000535) L-lactate permease 
(lctP) [Helicobacter pylori
26695] 


5.0 


70 


AF013717 


Homo sapiens 
periplakin (PPL) 
mRNA, partial cds 


5.0 


3719238


(AF064869) brain-enriched 
guanylate kinase-associated 
protein 2; BEGA2 [Rattus 
norvegicus] 


3.8 


71 


X58245 


Soybean mRNA for 
HMG-1 like protein 


5.0 


2995363 


(AL022245) biotin synthase 


0.99 


72 


AF102425


Frasera paniculata 
tRNA-Leu (trnL) 
gene, intron. 
chloroplast sequence 


4.9 


3522958 


(AC004411) putative 

p^** 11 llv A Lb I dOv ^ / VI UUIUUUola) 

thaliana] 


6.4 


73 


X82817 


H. sapiens 

PTP1C/HCP- variant 
gene 


4.9 


3875514 


(Z814y4) cDNA EST

EMBL:D27474 comes from this
gene; cDNA EST
EMBL:D27473 comes from this
gene; cDNA EST
EMBL:T00471 comes from this
gene; cDNA EST
EMBL:D34192 comes from this
gene; cDNA EST
EMBL:D37241 comes from this
gene; ...


2.8 




74 


U04827


Mus musculus brain
fatty acid-binding
protein


4.9 


3676132


(AL031765) 1-
evidence=predicted by content;
1-method=genefinder;084; 1-
method_score=31.96; 1-
evidence_end; 2-
evidence=predicted by match; 2-
match_accession=SPTREMBL:
Q93319; 2-
match_description=HYPOTHE
TICAL PROTEIN C33A11.2.;...


2e-09 




75 


AF03SS59


Neospora hughesi
strain NE1 internal
transcribed spacer 1,
complete sequence


4.8 


<NONE> 


<NONE> 


<NONE> 



Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID 1 ACCESSION I DESCRIPTION 



76 



80 



81 



82 



Y08222 



77 | AJ224475 



M.musculus MFH- 1 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION 



gene 

Borrelia burgdorferi 



left chromosomal 
subtelomeric region 
(prpB gene) 



Mus musculus LAP
putative membrane
protein (KRAG)
gene, exon 3 and
78 | U02486 | complete cds



DESCRIPTION 



4.8 



<NONE> 



4.8 



4218141 



79 | AB000280



Rat mRNA for
peptide/histidine
transporter, complete
cds



Z49771 



A.cepa mitochondrial
gene for NADH
dehydrogenase
subunit 3 and
ribosomal protein
S12



M63494 



Mouse IgG receptor 
(beta-Fc-gamma-RII) 
gene, exons 6 and 7, 
clones lambda-
Fc(3.2,93).



P VALUE 



<NONE> 



(AJ236702) HMR1 protein 
[Antirrhinum majus]



4.8 



3258103 



4.8 



806317 



4.5 



<NONE> 



(AP000006) 367aa long 
hypothetical protein 
[Pyrococcus horikoshii]



(M29067) unknown protein 
[Saccharomyces cerevisiae]



<NONE>



8.3 



0.001 



<NONE> 



4.3 



<NONE> 



Z14035 | S.pombe car1 gene



2.0 



3790665 



<NONE> 



<NONE> 



(AF099000) No definition line 
found [Caenorhabditis elegans]



<NONE> 



83 



Rhodococcus
erythropolis ThcA
(thcA) gene, complete
cds; and unknown
U17129 | genes



84 | AE001386 



Plasmodium
falciparum
chromosome 2,
section 23 of 73 of
the complete
sequence



2.0 



2828280 



(AL021687) putative protein

[Arabidopsis thaliana]
>gi|2832633|gnl|PID|e1249651
(AL021711) putative protein

[Arabidopsis thaliana]



2e-26 



2.0 



85 



86 



Human clone 23734 
U79292 | mRNA sequence



4176500 



(AL031177) dJ889M15.3 (novel
protein) o e _59 



Chloroplast Euglena
gracilis gene coding
for the 5S and 16S
V00159 | rRNA.



1.9 



<NONE>



<NONE>



1.9 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



SEQ
ID 



87 



88 



90 



91 



92 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



U95094
X93206 



89 | U60979 



X56272 



L22383 



DESCRIPTION I P VALUE | ACCESSION 



Xenopus laevis XL 
INCENP (XL- 
INCENP) mRNA 
complete cds 
H.salinarium TATA 
box-binding protein 
genes and ORFs 



Caenorhabditis 
elegans programmed 
cell death specifier 
(ces-2) gene, 
complete cds 



C. tentans ORFs (A- 
E) for hemoglobin 



1.9 



1.9 



<NONE> 
<NONE> 



1.9 



<NONE> 



Homo sapiens DNA 
sequence, repeat 
region. 



1.9 



<NONE> 



Hirudo medicinalis 
neuron-specific 
protein mRNA,
U82814 | complete cds



Haplomitrium
hookeri 18S rRNA
gene, partial
U18504 | sequence.



Pseudomonas stutzeri
nosDFY genes
involved in copper
X53676 | processing



Dictyostelium
discoideum multidrug
resistance
transporter/Ser
protease (tagC)
U60086 | mRNA, complete cds



1.9 



<NONE> 



DESCRIPTION 



P VALUE



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



1.9 



3822533 



(AF094531) immunoglobulin 
heavy chain precursor 



1.9 



1083969 



1.9 



298078 1 



Human putative G
protein-coupled
receptor (GPR17)
U33447 | gene, complete cds



1.9 



3879530 



Sus scrofa lactoferrin
mRNA, complete cds.
> :: gb|I28421|I28421
Sequence 5 from
M81327 | patent US 5571691



1.9 



3880034 



1.8 



<NONE> 



hypothetical protein 6 - fowlpox 
virus virus! 



<NONE> 



<NONE>



<NONE> 



<NONE> 



<NONE> 



2.0 



2.0 



(AL022198) putative protein 



(Z49130) cDNA EST
yk486b9.3 comes from this 
gene; cDNA EST yk486b9.5 
comes from this gene 



(Z75550) similar to cell division 
control protein [Caenorhabditis 
elegans] 



0.70 



6e-05 



7e-14 



<NONE> 



<NONE> 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


98 | Y07622


S.iniae lctP & lctO
genes and ORF1


1.8 | <NONE>


<NONE>


<NONE>


99 | M60474


Mouse myristoylated
alanine-rich C-kinase
substrate (MARCKS)
mRNA, complete cds


1.8 | <NONE>


<NONE>


<NONE>


100 | Y13901


Homo sapiens FGFR
4 gene


1.8 | <NONE>


<NONE>


<NONE>


101 | U44400


Human Down
Syndrome region of
chromosome 21,
clone A31D6-1D6.


1.8 | <NONE>


<NONE>


<NONE>


102 | U92808


Ruminococcus albus
beta-glucosidase
(gluA) mRNA,
complete cds


1.8 | <NONE>


<NONE>


103 | L25051


Candida albicans
argininosuccinate
lyase (ARG4) gene,
complete cds.


1.8 | <NONE>


<NONE>


<NONE>
<NONE>


104 | AE000546


Helicobacter pylori
26695 section 24 of
134 of the complete
genome


1.8 | <NONE>


<NONE>


<NONE>


105 | J00978


Xenopus laevis major
beta-globin gene,
complete cds.


1.8 | <NONE>


<NONE>


<NONE>


106 | U41716


Human
immunodeficiency
virus type 1 isolate
W95-5, vpr gene,
complete cds.


1.8 | <NONE>


<NONE>


<NONE>


107 | X66286


G.gallus mRNA for
tensin


1.8 | <NONE>


<NONE>


108 | U76636


Xenopus calbindin
D28k mRNA,
complete cds


1.8 | <NONE>


<NONE>


<NONE>


109 | J00664


Rabbit embryonic beta-
globin gene.


1.8 | <NONE>


<NONE>


<NONE>


110 | M21535


Human erg protein
(ets-related gene)
mRNA, complete cds.


1.8 | 2983160


(AE000693) hypothetical
protein [Aquifex aeolicus]


7.7



ACCESSION 



111 | M80829 



DESCRIPTION 



Rat troponin T 



cardiac isoform gene 
complete cds 



P VALUE 



ACCESSION 



DESCRIPTION 



ins) 



P VALUE



112 



Cyprinus carpio c-myc
gene for c-Myc,

D37887 | complete cds

Homo sapiens G
protein-coupled
receptor kinase 1 and
G protein-coupled
receptor kinase 1b
(GRK1) gene,
alternatively spliced
alternative exon 6,
exon 7, and partial
cds



1.8 



,999450 



1.8 



3023408 



BRANCHED-CHAIN AMINO
ACID TRANSPORT SYSTEM
II CARRIER PROTEIN
(BRANCHED CHAIN AMINO
ACID UPTAKE CARRIER)
>gi|1075007|pir||D64056
membrane-associated
component, branched amino
acid transport system (brnQ)
homolog - Haemophilus
influenzae (strain Rd KW20)
system II carrier protein (brnQ)
[Haemophilus influenzae Rd]



113 | AF019765



114 | AF025967 



Helicobacter pylori
J166 virulence
regulon
transcriptional
activator homolog
gene, partial cds,
strain-specific
genomic sequence B2



115 | U13183



(U10270) G-box binding factor
1 [Zea mays]



1.8 



3850108 



(AL033388) putative calcium-
transporting atpase

[Schizosaccharomyces pombe]



7.3



2494853



PROBABLE
HYDROXYACYLGLUTATHIONE
HYDROLASE
(GLYOXALASE II) (GLX II)
protein [Escherichia coli]
>gi|1786406 (AE000130)
probable

hydroxyacylglutathione
hydrolase [Escherichia coli]



7.2 



5.7 



5.5 
SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



116 



S68944 



117 | M92905



118 | X12429



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE 



ACCESSION 



Na+/Cl(-)-dependent
neurotransmitter 
transporter 



Rat calcium channel
alpha-1 subunit (rbB-
I) mRNA, complete

cds.



Xenopus laevis U1
70K gene exon 10 



119 



DS3333 



120 | AF016972



Mouse hepatitis virus 
genomic RNA for 
spike protein, partial 
cds 



Cervus elaphus 
REDDEER 
mitochondrial D- 
loop, complete 
sequence 



121 I AB010741 



122 



U32844 



DESCRIPTION 



1.8 



2276316 



P VALUE



(Z96810)GLYT-1LIKE [Homo 
sapiens] 



1.8 



3165522 



1.8 



2735957 



(AF067607) Similar to cuticular 
collagen; C18H7.3 



(AF015685) reverse
transcriptase domain protein 



1.8 



3876559 



1.8 



3878057 



Oncorhynchus mykiss 
mRNA for rtSox24,
complete cds 



Haemophilus 
influenzae Rd section
159 of 163 of the 
complete genome 



1.8 



1730805 



1.8 



72S910 



transcriptase domain protein 
Ki^siKiil) jiiiuijjuy it Huma n 

cyclin A/CDK2-associated

protein P19 (RNA polymerase

elongation factor)

(SW:SKP1_HUMAN); cDNA

EST EMBL:T00114 comes

from this gene; cDNA EST

yk390f11.5 comes from this

gene; cDNA EST yk402e11.5

co...

>gi|3877216|gnl|PID|e1346850
protein P19 (RNA polymerase
elongation factor) gene; cDNA
EST yk390f11.5 comes from
this gene; cDNA EST
yk402e11.5 co.



(299942) similar to von 

Willebrand factor type A 
domain; cDNA EST yk412d4.5 
comes from this gene; cDNA 
EST yk412d4.3 comes from this
gene 



'H V PU 1 HE 1 1L AL 2770 K.U ' 
PROTEIN IN RPS3-PSD1 
INTERGENIC REGION 
>gi|2132762|pir||S63129
probable membrane protein
YNL174w - yeast
(Saccharomyces cerevisiae)
>gi|1302152|gnl|PID|e23954S
(Z71451) ORF YNL174w
[Saccharomyces cerevisiae]



A-TYPE INCLUSION
PROTEIN (ATI) camelpox

virus >gi|62381 (X69774)
84kDa A-type inclusion protein
[unidentified]



5.5 



5.5 



3.3 



3.3 



2.5 



1.9 



Nearest Neighbor (BlastN vs. Genbank)
Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 




DESCRIPTION 



P VALUE 



ribosomal protein YS7 homolog 
Emericella nidulans



filaggrin precursor - mouse 
(fragment) 



(fragment)

PROBABLE PROTEIN



DISULFIDE ISOMERASE P5
PRECURSOR >gi|1065461
(U404H) Similar to protein
disulfide-isomerase.
[Caenorhabditis elegans]



1.4 



0.87 



REGULATORY PROTEIN

BRLA (BRISTLE A PROTEIN)
>gi|83718|pir||A28913
regulatory protein brlA
Emericella nidulans >gi|168029
(M20631) brlA protein



0.87 



metalloproteinase I (EC 3.4.24. 
) - human 



(AF109907) S164 [Homo
sapiens] 



(L13442) cysteine-rich extensin- 
like protein-4 [Nicotiana 
tabacum] 



(AJ010792) Muc5AC protein 
[Mus musculus] 



(AE000766) enolase-
phosphatase E-1 [Aquifex
aeolicus]



0.84 



0.65 



0.58 



0.52 



0.38 



0.095 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


141 | AEOOId^f


Plasmodium
falciparum
chromosome 2,
section 67 of 73 of
the complete
sequence


1.8 | 1931647


(U95973) endomembrane




142 | L1970R


Rat N-methyl-D-
aspartate receptor
(NMDAR1) gene,
first exon.


1.8 | 1731181


n I ru 1 hit 1 it_ AL /i>.s KD 
PROTEIN C14A4.3 IN
CHROMOSOME II
>gi|3874230|gnl|PID|e1351618
protein (Swiss Prot accession
number P38376); cDNA EST
yk220e10.5 comes from this
gene [Caenorhabditis elegans]


2e-20


3e-21


143 | Y10728


P.schwarzi
mitochondrial cytb
gene, partial


1.8 | 3878644


(Z81103) predicted using
Genefinder; cDNA EST
yk303g11.5 comes from this
gene; cDNA EST yk303g11.3
comes from this gene
[Caenorhabditis elegans]


1e-28


144 | AB006631


Homo sapiens mRNA
for KIAA0293 gene,
partial cds


1.8 | 4176500


(AL031177) dJ889M15.3 (novel
protein)


7e-45


145 | AF106967


Mus musculus 13
protein mRNA,
complete cds


1.7 | <NONE>


<NONE>


<NONE>


Archaeoglobus
fulgidus section 34 of
172 of the complete
146 | AE001073 | genome


1.7 | <NONE>


<NONE>


147 | U12977


lemoignei poly(3-
hydroxybutyrate)
depolymerase A
precursor (phaZ5)
gene, complete cds,
and glycerol-3-
phosphate-
dehydrogenase
homolog, complete
cds.


1.7 | <NONE>


<NONE>


<NONE>
<NONE>


148 | M27038


Mus musculus
(SK/CamRk)
germline IgK chain
gene, J1-5 region.


1.7 | <NONE>


<NONE>


<NONE>
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


149


X74142


H.sapiens HBF-1
mRNA for
transcription factor


1.7 


<NONE> 


<NONE> 


<NONE> 


150 


U40830 


Streptococcus
thermophilus DeoD
gene, partial cds and
EpsA, EpsB, EpsC,
EpsD, EpsE, EpsF,
EpsG, EpsH, EpsI,
EpsJ, EpsK, EpsL,
EpsM, Orf 14.9
protein genes,
complete cds


1.7


<NONE> 


<NONE> 


<NONE> 


151 


L29172 


Rabbit Ig germline
gamma H-chain
(allotype d12,e15) C-
region gene, 3' end


1.7


<NONE>


<NONE>


152 


M19045 


Human lysozyme 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


153 


AE001159


Borrelia burgdorferi
(section 45 of 70) of 
the complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 


154 


L17027 


Plasmid pFdA (from 
Fremyella 
diplosiphon) DNA 
sequence, including 
unidentified cds and 
stem loop. 


1.7 


<NONE>


<NONE> 


<NONE> 


155 


U12232 


Arabidopsis thaliana 
Columbia GTP 
binding protein beta 
subunit (AGB1) 
mRNA, complete cds. 


1.7 


<NONE> 


<NONE> 


<NONE> 


156 


D42056


Arabidopsis thaliana
ATPK6 mRNA for
ribosomal-protein S6
kinase homolog,
complete cds


1.7 


<NONE> 


<NONE> 


<NONE> 


157 


X98117


Rhizobium

leguminosarum prsD,
prsE, ORF3 genes


1.7 


<NONE> 


<NONE> 


<NONE>






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


158 


AF039084


Spinacia oleracea 
heat shock 70 protein
protein, complete cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


159 


Z12651


R.norvegicus gene for
catechol

methyltransferase


1.7




<NONE> 


<NONE> 1 


160 


AF002968


Fringilla coelebs
mitochondrial control
region, partial
sequence


1.7 


<NONE> 


<NONE> 


<NONE> 1 


161 


AE001160


Borrelia burgdorferi 
(section 46 of 70) of 
the complete genome 


1.7 


<NONE> 


<NONE> 


<NONE> 1 


162 


U67553 


Methanococcus 
jannaschii section 95 
of 150 of the 
complete genome


1.7 


<NONE> 


<NONE> 


<NONE> I 


163 


M86247


S.ruminantium 
plasmid pS23 DNA. 


1.7 


<NONE> 


<NONE> 


<NONE> 1 


164 


S74436 


oIL-8=interleukin-8 
[sheep, spleen cells,
mRNA. 1435 nt] 


1.7 




<NONE> 


<NONE> 1 


165 


D12719 


Candida maltosa 
ALK7 (CYP52A10) 
and ALK8 complete
cds 


1.7 


<NONE> 


<NONE> 


<NONE> 


166 1 


U02625 


Geotrichum 
candidum NRRL Y- 
553 lipase gene, 
partial cds. 


1.7 


321245 


230k bullous pemphigoid 
antigen BPM1 - mouse 


9.3


167 1 


Z58881 


H.sapiens CpG DNA,
clone 114a4, reverse
read cpg114a4.rt1a.


1.7


1854675


(U66298) bone morphogenetic
protein-6 [Rattus norvegicus]


9.1


168


U43674


Agrobacterium
tumefaciens conjugal
transfer region 1
genes


1.7


1352066


LARGE PROLINE-RICH

PROTEIN BAT2 MHC class III
histocompatibility antigen HLA-
B-associated transcript 2 -
human >gi|179339 (M33509)
HLA-B-associated transcript 2
(BAT2) [Homo sapiens]
>gi|179345 (M33518) HLA-B-
associated transcript 2 (BAT2)
[Homo sapiens]


9.1



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


169 


AL023827


Caenorhabditis 
elegans cosmid 
Y12A6A, complete
sequence 
[Caenorhabditis 
elegans] 


1.7 


731440 


PROTOPORPHYRINOGEN

OXIDASE (PPO) - yeast

(Saccharomyces cerevisiae)
>gi|603606 (U18778) Hem14p:
protoporphyrinogen oxidase

[Saccharomyces cerevisiae]

>gi|1403536|gnl|PID|e249333
(Z71381) protoporphyrinogen
oxidase [Saccharomyces
cerevisiae]


8.9 


170 


X69662


X.laevis mRNA for 
glutathione 
synthetase, large 
subunit 


1.7 


4038057 


(AC005897) hypothetical 
protein [Arabidopsis thaliana] 


8.8 


171 


Z35824 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBL063w 


1.7 


3021450 


(Y15515) prdl-a [Hydra
vulgaris] 


7.0 


172


M65139


Cowpea chlorotic
mottle virus (CCMV)
a protein gene,
complete cds.


1.7


2506307


COLLAGEN ALPHA 1(XII)

CHAIN PRECURSOR 1(XII)

chain - chicken

>gi|222811|gnl|PID|d1001160

gallus]

>gi|2326442|gnl|PID|e39435
(X61024) collagen type XII
alpha 1 chain [Gallus gallus]


7.0


173


X15065


Drosophila distal BX-
C region (bithorax
complex) pHlS9 5'

region;


1.7


1723625


-lYPOTHjbTIL'AL itfiTKD 

PROTEIN IN ALPA-GABD
INTERGENIC REGION (F87)
>gi|1033124 (U36840)
ORF_f87 [Escherichia coli]
>gi|1788982 (AE000348) orf,
hypothetical protein


6.9 
SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION 




(Z46792) similar to lethal(1)
discs large-1 tumor suppressor
protein-like repeats; cDNA EST
EMBL:D33495 comes from this
gene; cDNA EST
EMBL:D35117 comes from this
gene; cDNA EST
EMBL:D36356 comes from this
gene; cDNA EST EMB...
>gi|3879984|gnl|PID|e1351767
suppressor protein-like repeats;
cDNA EST EMBL:D33495
comes from this gene; cDNA
EST EMBL:D35117 comes
from this gene; cDNA EST
EMBL:D36356 comes from this
gene; cDNA EST EMB.



THYMIDINE KINASE 



6.7 



saimiriine herpesvirus 1 (strain 
HfQnc]) >gi|60341



6.7 



(U38184) ATPase subunit 6 
[Trypanosoma cruzi]



(AL023862) hypothetical
protein SC3F9.07 [Streptomyces
coelicolor] | 6.7



(U64859) glutamine-rich protein
[Caenorhabditis elegans] | 6.7

DYNEIN HEAVY CHAIN,
CYTOSOLIC (DYHC) dynein
heavy chain

[Schizosaccharomyces pombe] | 6.6



(M5S520) endo-1,4-beta-

glucanase [Fibrobacter
succinogenes]



6.6 



Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE I ACCESSION 



DESCRIPTION 



Arabidopsis thaliana
anthranilate synthase
alpha subunit gene,
181 | M92354 | complete cds.



1.7 



182 



183 



184 



Hordeum vulgare
genomic DNA
fragment; clone
AJ234856 | MWG2234.rev
Stercorarius
parasiticus bird J33
cytochrome b protein,
U76827 | partial cds



738308 



blue light photoreceptor 
[Arabidopsis thaliana]



6.5 



Saccharomyces
cerevisiae Ttp1p
(TTP1) gene,
U05211 | complete cds.



1.7 



1.7 



3142302 



3413810 



Homo sapiens
TRRAP protein
(TRRAP) mRNA,
185 | AF076974 | complete cds



1.7 



403173 



1.7 



1170140 



186 I AE000753 



Aquifex aeolicus
section 85 of 109 of
the complete genome



1.7 



1169357 



187 I AF005638 



Tupaia glis
apolipoprotein AI
prepropeptide
mRNA, complete cds



1.7 



3355682 



188 | M23090 



Human germline IgK
chain gene V3-region,
clone Humkv328h5



1.7 



2257483 



(AC002411) Strong similarity to
myosin heavy chain gb|Z34293
from A. thaliana. [Arabidopsis
thaliana] | 6.5



(Y17034) Bassoon [Mus 
musculus]



(L24492) lipoprotein 
[Rhodococcus
erythropolis]



PUTATIVE 
ENDOGLUCANASE TYPE K 
PRECURSOR (ENDO-1,4- 
BETA-GLUCANASE) 
(CELLULASE) 



DNA ADENINE METHYLASE
site-specific DNA- 
methyltransferase (adenine- 
specific) dam methylase gene 
product [Vibrio cholerae]



(AL031124) putative secreted
lyase 



189 I M24001 



Mink enteritis virus
antigenic type 2
capsid protein genes
VP1 and VP2,
complete cds.



H.sapiens CST4 gene
190 | X59964 | for Cystatin D



1.7 



2143504 



1.7 



1766075 



(AB004534) pi003 
[Schizosaccharomyces pombe] 



myotonic dystrophy kinase - 
mouse (fragment) kinase, DM- 
kinase {C-terminal, alternatively
spliced, clone delta II.III.IV.V) 
[mice, brain. Peptide Partial, 
474 aa] [Mus sp." 



(U37273) winged helix protein 
CWH-2 [Gallus gallus]



5.4 



4.9 



4.1 



4.0 



4.0 



4.0 



3.9 



3.1 
SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



191 



X95276 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



P.falciparum 
complete gene map of 
plastid-like DNA (IR- 

B) 

Rat PMSG-induced



192 | D84487



193 



L14851



ovarian mRNA, 
3'sequence, N10 



1.7 



Rattus norvegicus 
neurexin III-alpha
gene, complete cds. 



1.7 



1.7 



DESCRIPTION 



HYPOTHETICAL 11.7 KD 



P VALUE



3219951 



173164 



3323586 



PROTEIN C6B12.13 IN

CHROMOSOME I

>gi|2330843|gnl|PID|e334047
pombe]



(J02719) valyl-tRNA synthetase
[Saccharomyces cerevisiae] 



(AF060869) single-strand 
binding protein [Salmonella 
typhimurium]



194 I M97002 



Xenopus laevis/gilli 
hybrid pseudo-IgH 
chain gene, V region 
clone LG7G342A. 



1.7 



2118407 



MHC sex-limited protein - 
mouse (fragment) musculusl 



195 



L07025 



196 I S73149 



Bacillus thuringiensis
delta-endotoxin 
(CryA(a)) gene. 5' 
end. > :: 

gb|I34520|I34520 
Sequence 1 from 
patent US 5596071 > 

gb|I39790|I39790 
Sequence 1 from 
patent US 5616495 > 

gb|AR008487|AR008
487 Sequence 1 from
patent US 5753492 



insulin-like growth 
factor II {intron 7}
[human, Genomic,
1702 nt]



1.7 



1.7 



2496940 



HYPOTHETICAL 53.4 KD 
PROTEIN D1054.13 IN
CHROMOSOME V
>gi|3875316|gnl|PID|e1344967



3327038 



(AB014512) KIAA0612 protein 
[Homo sapiens]



197 I D86990 



Human (lambda) 
DNA for 

immunoglobulin light 
chain 



1.7 



494367 



Fv Fragment (Murine Sel55-4) 
Complex With The 
Trisaccharide: Alpha-D- 
Galactose(l-2)[alpha-D- 
Abequose(l-3)]alpha- D- 
Mannose (Pl-Ome) (Part Of 
The Cell-Surface Carbohydrate 
Of Pathogenic Salmonella) 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


198


L17027


Plasmid pFdA (from
Fremyella
diplosiphon) DNA
sequence, including
unidentified cds and
stem loop.


1.7 


1082702 


poliovirus receptor-related 
protein - human 


1.4 


199 


AL022273 


Caenorhabditis 
elegans cosmid 
H22D14, complete 
sequence 
[Caenorhabditis 
elegans] 


1.7 


3924605 


(AF069442) putative inhibitor 
of apoptosis [Arabidopsis 
thaliana] 


1.4 


200 


U89926 


Drosophila 
melanogaster cut 
gene, partial sequence 


1.7 


2245100


(Z97343) DNA-binding protein 


1 T 
I. J 


201 


Z25749 


H.sapiens gene for 
ribosomal protein S7 


1.7 


2493459 


PROTEIN KINASE C 
SUBSTRATE, 60.1 KD 
PROTEIN, HEAVY CHAIN 
(PKCSH) (80K-H PROTEIN) 
>gi|1215746 


1.1


202 


U59841 


Fundulus heteroclitus 
lactate dehydrogenase 
B 


1.7 


30055S7 


(AF048977) Ser/Arg-related 
nuclear matrix protein [Homo 
sapiens] 


0.82


203 


X55763 


Rabbit mRNA for 
smooth muscle 
calcium channel 
blocker (CaCB) 
receptor 


1.7 


3883128 


(AF082302) arabinogalactan- 
protein [Arabidopsis thaliana] 


0.82 


204 


Z75528 


Caenorhabditis 
elegans cosmid 
C18B12A, complete 
sequence 
[Caenorhabditis
elegans] 


1.7 


940397 


(D10123) core [Hepatitis C
virus]


0.80


205 


U50912 


Human XIST gene, 
polypurine-
pyrimidine repeat
region


1.7 


2338027 


(AF005370) large tegument 
protein [Alcelaphine herpesvirus 
1] 


0.59 


206 


X12817 


Ovis aries beta-
lactoglobulin gene


1.7 


987050 


(X65335) lacZ gene product
[unidentified cloning vector]


0.45 


207 


AF004419 


Homo sapiens
troponin T (TNNT2)
gene, exon 13


1.7 


2996364 


(AF053947) unknown [Yersinia
pestis] >gi|3883090


0.22 


208 


L43643 


Gallus domesticus
DNA microsatellite
marker MCW119


1.7 


464S96


TRANSDUCIN-LIKE
ENHANCER PROTEIN 1
enhancer-of-split homolog TLE-
1 - human >gi|307510


0.20 



Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE



209 



Z73278



S.cerevisiae
chromosome XII
reading frame ORF
YLR106c



1.7 



1351657 



210 I M22345 



Mouse endogenous
provirus gag, pol, and
env region DNA.



211 | AE000360 



Escherichia coli K-12
MG1655 section 250
of 400 of the
complete genome



1.7 



2444455 



1.7 



2736361 



PROTEIN C30D11.04C IN 
CHROMOSOME I 
>gi|2130411|pir||S62562 
hypothetical protein 
SPAC30D11.4c - fission yeast
nuclear pore complex protein
[Schizosaccharomyces pombe]



(AF020765) hypothetical 
protein [Myxococcus xanthus]



(AF039038) No definition line 
found [Caenorhabditis elegans]



212 | AB020692 



213 



S69429 



Homo sapiens mRNA
for KIAA0885
protein, complete cds
testis-determining
gene/SRY homolog
[Sminthopsis
macroura=striped-
faced dunnarts,
Genomic, 855 nt]



1.7 



2605924 



214 



S69429 



testis-determining
gene/SRY homolog
[Sminthopsis
macroura=striped-
faced dunnarts,
Genomic, 855 nt]



1.7 



2499016 



215 1 U67205 



Mus musculus ACF7
neural isoform 3
(mACF7) mRNA,
partial cds



1.7 



2499016 



1.7 



2047349 



216 | X98188 



Artificial DNA
sequence for
mammalian lambda-
neo minichromosome

1400 bp



217 



Mus musculus
putative CCR4
protein mRNA,
U70139 | partial cds



1.7 



2493779 



1.7 



2252630 



(AF029726) histidine kinase C 
[Dictyostelium discoideum] 



TONB PROTEIN >gi| 1666536 
(U23764) TonB [Pseudomonas 
aeruginosa]



TONB PROTEIN >gi| 1666536 
(U23764) TonB [Pseudomonas 
aeruginosa] 



0.20 



0.12 



0.12 



0.094 



0.092 



0.08S 



(AF000198) weak similarity to 
HSP90 [Caenorhabditis 



PUTATIVE CUTICLE
COLLAGEN C09G5.6 
collagen; cDNA EST yk244c3.5 
comes from this gene; cDNA 
EST yk244c3.3 comes from this 
gene [Caenorhabditis elegans]



elegans] | 0.052



0.04: 



(U95973) hypothetical protein 
[Arabidopsis thaliana]



0.041 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION | DESCRIPTION


P VALUE | ACCESSION | DESCRIPTION


P VALUE


218 | L3880S


Homo sapiens alpha-
1 type V collagen
(COL5A1) gene, 5'
flank and exon 1.


1.7 | 2895760


(AHM5246) universal minicircle
sequence binding protein
minicircle sequence binding
protein [Crithidia fasciculata]




219 | Z72151


B.napus mRNA for
AMP-binding protein


1.7 | 190475


(K02576) salivary proline-rich
protein I [Homo sapiens]


0.039


220 | X94152


R.norvegicus mRNA
for cysteine sulfinate
decarboxylase
Mouse stathmin gene


1.7 | 2136212


synapsin IIb - human
>gi|1594277 (U40215) synapsin
IIb [Homo sapiens]


0.011
0.008



L13600



Rattus norvegicus
glycine transporter
mRNA, complete cds



AJ224150 



Plasmodium berghei
EF-1alpha A-gene



2317934 [herpesvirus 68] 



0.006 



1.7 



726403 



224 | S80642 



225 



226 



227 



228 



M22363 



butyrophilin [mice,
lactating mammary
gland, mRNA Partial,
3193 nt]

C.elegans unc-86
gene encoding two 
alternative proteins. 
complete cds. 



(U23175) similar to anion
exchange protein
[Caenorhabditis elegans]



1.7 



2072290 



(U95094) XL-INCENP
[Xenopus laevis]



0.003 



0.001 



X92123
M.musculus cgt gene
exon 1

AB016000
Ipomoea nil PKn2
(knotted-like gene)
mRNA, complete cds



D14133



Bovine mRNA for 
synaptocanalin I



1.7 



1.7 



2695746 



2224683 



(AJ223010) Pmt2
[Schizosaccharomyces pombe] | 9e-04

(AB002369) KIAA0371 [Homo
sapiens] | 1e-04



1.7 



1.7 



3874232 



2183083 



(Z49909) similar to Prokaryotic
ribonuclease PH
[Caenorhabditis elegans]



1.7 



392527.7 



(AF000422) TTF-I interacting
peptide 5 [Homo sapiens]
(AL0326? J>) similar to
Uncharacterized protein family
UPF0034, Double-stranded
RNA binding motif; cDNA EST
yk489b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]



3e-05 



1e-05



2e-06 
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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


229 


L01991


Mus musculus TAFC
1-like neuronal
glycoprotein (PCS)
mRNA, complete cds


1.7


3006139


(AL022299) hypothetical
protein


4e-07


230 


X63016


tomato yellow leaf
curl virus Thailand
isolate complete
genome (TYLCV-TH
B-DNA)


1.7


3643608


(AC005395) hypothetical
protein [Arabidopsis thaliana]


1e-07


231 


Z22802


H.sapiens
microsatellite repeat.
> ::
gb|G34562|G34562
human STS SHGC-
51834


1.7


100210


extensin precursor (clone Tom L 
4) - tomato esculentum] 


4e-09 


232 


K02765 


Human complement 
component C3 
mRNA, alpha and
beta subunits, 
complete cds. 


1.7 


2984320 


(AE000773) acetoin utilization 
protein [Aquifex aeolicus]


1e-09


233 j 


Z74818 


S.cerevisiae
chromosome XV
reading frame ORF
YOL076w


1.7


3873700 


(Z./JIU2) predicted using
Genefinder; Similarity to
Bacillus subtilis DNAJ protein
gene; cDNA EST
EMBL:C12520 comes from this
gene; cDNA EST
EMBL:D71409 comes from this
ge...


7e-11


234


D21871


Pig mRNA for thimet
oligopeptidase


1.7


2632098


(Y15513) Prodos protein
[Drosophila melanogaster]


8e-13


235


Y14344


Gallus gallus gene
encoding neurofascin,
exons 9,10,11 & 12


1.7


3876421


(^1070) cDNA EST
EMBL:C12730 comes from this
gene; cDNA EST yk200b6.5
comes from this gene; cDNA
EST yk349a12.5 comes from
this gene [Caenorhabditis
elegans]


3e-14


236


Z73608


S.cerevisiae
chromosome XVI
reading frame ORF
YPL252c


1.7


1439663


(J64605) C05D9.6 gene
product [Caenorhabditis
elegans]


6e-18
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












237


AG000518


Homo sapiens
genomic DNA, 21q
region, clone:
T171N23


1.7


1174468


OLIGOSACCHARYL
TRANSFERASE STT3
SUBUNIT HOMOLOG 
>gi|529357 (U13019) No 
definition line found 
[Caenorhabditis elegans] 


6e-18 


238 


D17716 


Human mRNA for N- 
acetylglucosaminyltransferase
V, complete
cds 


1.7 


961446 


(D63877) KIAA0157 gene 
product is novel. 


5e-19 


239 


AF102512 


Cheilodactylus
vittatus country USA:
Midway Island 
cytochrome c oxidase 
subunit I gene, 
mitochondrial gene 
encoding 
mitochondrial 
protein, partial cds 


1.7 


1572756 


(U70848)C43G2.1 gene 
product [Caenorhabditis 
elegans] 


5e-40 


240 


L30107 


Rattus norvegicus 
liver-specific 
transporter gene, 
promoter region. 


1.7 


4176443 


(AL022238) dJ1042K10.4 
(novel protein) 


3e-49 


241 


X91220 


H. sapiens mRNA for 
Na-Cl electroneutral 
thiazide-sensitive
cotransporter 


1.7 


3478637 


(AC005546) R29425_l [Homo 
sapiens] 


6e-54 


242 


U97146 


Rattus norvegicus 
calcium-independent 
phospholipase A2 
mRNA, complete cds 


1.6 


<NONE>


<NONE> 


<NONE> 


243 


Z48508 


Pea seed-borne 
mosaic virus RNA for 
coat protein and 
polymerase (partial) 


1.6 


<NONE> 


<NONE> 


<NONE> 


244 


M18349


Rat leukocyte 
common antigen (L-
CA) gene, exons 1
through 5.


1.6 


<NONE> 


<NONE> 


<NONE> 


245 


M13158


Yeast (S.pombe)
cdc25+ gene (mitosis
initiation), complete
cds.


1.6 


<NONE> 


<NONE> 


<NONE> 




WO 01/02568 
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SEC 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


246 


1 U39712 


Mycoplasma 

OPnitlliiim c/v'MAn ti 
gdllullllUTl oCClIun J*1 

of 51 of the complete 
genome 


* 

1.6 


<NONE> 


<NONE> 


<NONE> 


247 


M17922


Mouse Murine
urokinase-type
plasminogen activator
protein gene,
complete cds.


1.6


3875750 


(*.siwy) predicted using 
Genefinder; cDNA EST 
yk410e3.3 comes from this 
gene; cDNA EST yk410e3.5 
comes from this gene 
[Caenorhabditis elegans]


8.0 


248 


1 M89986 


Human polymorphic 
loci in Xq28. 


1.6 


3261710 


(Z84724) psd [Mycobacterium 
tuberculosis] 


6.4 


249 


1 M89986 


Human polymorphic 
loci in Xq28. 


1.6 


2143805 


inositol-polyphosphate 4- 
phosphatase - rat 


6.2 


250 


U68725


Rattus norvegicus
Deleted in colorectal 
Cancer 


1.6 


1256804 


(U51449)RING3 protein 
[Xenopus laevis] 


5.8 


251 


X95199 


P.platessa GSTA,

GSTA1, GSTA2, and 
PPTN genes 


1.6 


3915113 


MALEYLACETATE 
REDUCTASE Pseudomonas 
cepacia >gi|643636 (U19883) 
maleylacetate reductase 
[Burkholderia cepacia]


4.9 


252


Y09103


D.melanogaster 

17 P A 1 nana 

ivrAl gene 


1.6 


3916021 


HYPOTHETICAL 91 KD
PROTEIN IN COB INTRON
>gi|2654230|gnl|PID|e1192341
(X02819) unidentified reading
frame [Schizosaccharomyces
pombe]


4.8 


253 1 


214078 


T.aestivum 
mitochondrion fMet, 
18S, 5S repeat unit 
DNA 


1.6 


2501668 


DYSTROPHIN-RELATED 
PROTEIN 2 sapiens] 


3.6 


254


AB002314


Human mRNA for
KIAA0316 gene,
complete cds


1.6


130997


REPETITIVE PROLINE-RICH
CELL WALL PROTEIN 1
PRECURSOR

>gi|81S09|pir||A29324 proline-
rich protein precursor - soybean
>gi|170049 (J02746) proline-
rich protein [Glycine max]


2.8


255


M21488


Human muscle
creatine kinase gene
(CKMM), exon 2.


1.6


119399


ENV POLYPROTEIN
PRECURSOR (COAT
POLYPROTEIN) [CONTAINS:
COAT PROTEIN GP62: COAT
PROTEIN GP40]


2.2






PCT/US00/18374 



Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ 
ID 



ACCESSION 



AE001164



DESCRIPTION | P VALUE | ACCESSION



DESCRIPTION 



Borrelia burgdorferi 
(section 50 of 70) of 
the complete genome 



1.6 



4050089 



(AF109907) hypothetical 
protein [Homo sapiens]



P VALUE



X61757 



M.musculus 
rearranged T-cell 
receptor beta variable
region (Vb17a)



M15346



T.cruzi tandemly 
repeated gene 
encoding an 85 kDa 
antigen with 
homology to heat 
shock proteins. 



1.6 



3377766 



(AF080090) semaphorin IV 
isoform b [Mus musculus] 



1.6 



2804437 



(AF043695) similar to zinc 
metalloprotease family of 
peptidases [Caenorhabditis 
elegans]



1.2 



L39018 



Rattus norvegicus 
sodium channel 
protein 6 (SCP6) 
mRNA, complete cds



1.6 



2920535 



(AF018081) type XVIII
collagen [Homo sapiens] 



M29483 



Human leukocyte 
adhesion protein 
p150,95 alpha subunit
gene, exons 7-15. 



1.6 



1840045 



L06844 



Aspergillus niger beta-
D-fructofuranosidase
(sucl) gene, one 
exon. 



1.6 



4206210 



(U49082) transporter protein 
[Homo sapiens] 



(AF071527) putative calcium 
channel [Arabidopsis thaliana] 



0.037 



2e-09 



9e-10 



M10946



Chicken aldolase B 
gene, complete cds, 
clones lambda- 
COl. 1.4). 



1.6 



2746775 



(AF040640) similar to peptidase 
family C19 (ubiquitin carboxyl- 
terminal hydrolase) 
[Caenorhabditis elegans]



1e-31



X07881 



Human gene PRB3L 
for proline-rich 
protein Gl 



1.5 



<NONE> 



<NONE> 



<NONE> 



U22260 



Nicotiana tabacum 
UMP synthase (pyr5- 
6) mRNA, partial cds



1.5 



U76759 



Mus musculus 
nuclear protein 
NIP45 mRNA. 
complete cds 



3880923 



(Z99271) similar to Reverse 
transcriptase comes from this 
gene [Caenorhabditis elegans] 



0.50 



1.4 



1330394 



(U58761) C01F1.6 gene product
[Caenorhabditis elegans]



8.9 






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE












266


AF076470


Rice tungro
bacilliform virus
Serdang strain,
complete genome


1.4


1703461


POTASSIUM-
TRANSPORTING ATPASE
BETA CHAIN (PROTON
PUMP) (GASTRIC H+/K+
ATPASE BETA SUBUNIT)
3.6.1.36) beta chain - human
>gi|184105 (M75U0) H,K-
ATPase beta subunit [Homo
sapiens]


8.9 


267 


X64659 


C.jacchus interferon 
gene for interferon 
gamma 


1.4 


1486485


(U28832) US10 [Gallid
herpesvirus 1] >gi|1486497 


6.8 


268 


U11825 


Schistosoma 
japonicum structural 
muscle protein 
paramyosin mRNA,
complete cds. 


0.88 


<NONE> 


<NONE> 


<NONE> 


269 


D84278 


Human DNA for 
CD38. exon 1 


0.68 


3766363 


(AL031907) hypothetical serine
rich protein 

[Schizosaccharomyces pombe] 


3.0 


270 


M59755 


Bovine lens aldose 
reductase 

pseudogene, 3' end.


0.67 


<NONE> 


<NONE> 


<NONE> 


271 


M81758 


Homo sapiens 
skeletal muscle 
voltage-dependent 
sodium channel alpha 
subunit (SkM1)
mRNA. complete cds. 


0.65 


2437819 


(Z86105) 1,4-beta-glucanase 
[Anaerocellum thermophilum]


3.6 


272


L01965


Human type IV
sodium channel alpha
polypeptide


0.64


2437819


(Z86105) 1,4-beta-glucanase
[Anaerocellum thermophilum]


3.5


273


U90122


Danio rerio bone
morphogenetic
protein-4 (bmp4)
mRNA, partial cds


0.63


2983532


(AE000720) formate
dehydrogenase alpha subunit
[Aquifex aeolicus]


7.9


274


L41624


Hylobates lar mucin
(MUC1) gene, exons
1-6.


0.63


15I780S


(D79215) FGF-10 [Rattus
norvegicus]


0.91



"5^9 






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


275 1 


AF030881 


Fugu rubripes sushi 
retrotransposon gag 
polyprotein (gag) and
pol polyprotein (pol)
genes, complete cds


0.63


1519696 


(U6/y:>o) coded for by C.
elegans cDNA yk126f9.5; coded
for by C. elegans cDNA
yk159h6.3; coded for by C.
elegans cDNA yk126f9.3; coded
for by C. elegans cDNA
yk159h6.5 [Caenorhabditis
elegans]


53 " 1 
xi 1 


276 1 


U52909 


Arabidopsis thaliana 
Ul snRNP 70K 
protein gene, 
complete cds 


0.62 I 


' <NONE> 


<NONE> 


0.38 


277 I 


AF008192 


Homo sapiens 
putative GR6 protein 
(GR6) mRNA, 
complete cds 


0.62


3800934 


(AF100655) contains similarity 
to ser/thr protein kinases 
[Caenorhabditis elegans]


<NONE> j 
9.7 I 


278 I 


U17081 


Human fatty acid 
binding protein 
(FABP3) gene, 
complete cds 


0.62 I 


3617848 


(AF049709) tyrosylprotein 
sulfotransferase-A; TPST-A 


7.7 1 


279 | AB018340


Homo sapiens mRNA 
for KIAA0797 
protein, partial cds 


0.62


424044 


VP5 protein - porcine rotavirus 
>gi|61355 


7.7 1 


280 | Y00093 


H.sapiens mRNA for 
leukocyte adhesion
glycoprotein p150,95


0.62


1054945 


(U38621) polyprotein [Tobacco
vein mottling virus]


4.5


281 | M63138


Human cathepsin D
(catD) gene, exons 7,
8, and 9.


0.62


136810


GLYCOPROTEIN M

>gi|73791|pir||WMBE51 UL10
protein - human herpesvirus 1 1-1
173) [Human herpesvirus 1]
>gi|221732|gnl|PID|d1002131




282 | X76056


. a/ivesins DNA for
spacer region
between 25S and 18S
ribosomal RNA genes


0.62


2661176


(J7667I) putative cds
[Rhodobacter sphaeroides]


3.5


283 | X74501


B.taurus mRNA for
ACTH receptor


0.62


4249552


(AB001075) galectin-2 related
protein


2.0


284 | M57634


Rat F1-ATPase beta
subunit mRNA, 3'
end.


0.62


2119692


transforming growth factor-beta
type III receptor - chicken
>gi|5l lS43 (L0U21)
transforming growth factor-beta
type III receptor [Gallus gallus]


2.0
1.5






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)




(U0S884) protein VIII precursor
[Bovine adenovirus type 3] | 7.6






SEQ 
ID I ACCESSION 



Nearest Neighbor (BlastN vs. Genbank)



DESCRIPTION 



292 



U70825 



293 | L81667 



294 | AE000760



295 



U58512 



296 I U27459 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
P VALUE | ACCESSION | DESCRIPTION



Rattus norvegicus 5- 
oxo-L-prolinase , 
mRNA. complete cds 



Homo sapiens 
(subclone 2_a9 from
P1 H49) DNA
sequence 



0.61 



733543 



0.61 



2565087 



Aquifex aeolicus 
section 92 of 109 of 
the complete genome | 0.61

Mus musculus Rho- 
associated, coiled- 
coil forming protein 
kinase p160 ROCK-1
mRNA, complete cds | 0.61
Human origin
recognition complex
protein 2 homolog
hORC2L mRNA,
complete cds | 0.61



2811092 



P VALUE



(U23448) similar to genome 
polyprotein 

(SP:POLG_BVDVN, P19711);
alternative splicing to C04A2.7al 



4.4 



(U80759) CAGH4 alternate 
open reading frame [Homo 
sapiens] 



3.3 



297 | L36680 



298 | AE000673



299 | AF086310



300 I AJ009675 



301 I AC005577 



Pisum sativum S- 
adenosylmethionine 
synthase mRNA, 3' 
end. 



295671 



200285 



HOMEOBOX PROTEIN HOX-
A3 (HOX-1.5) homeobox-
containing transcription factor
[Mus musculus] | 2.6

(L U 275) selected as a weak
suppressor of a mutant of the
subunit AC40 of DNA
dependent RNA polymerase I
and III ( j 5



(M97900) putative open reading
frame [Mus musculus] | 0.66



0.61 



(AB002086) p47 [Rattus
2285790 norvegicus]



4e-12



Aquifex aeolicus 
section 5 of 109 of 
the complete genome 



Homo sapiens full 
length insert cDNA 
clone ZD51F08 



Agrotis ipsilon 
mRNA for 3-hydroxy 

methylglutaryl 
coenzyme A 
reductase 



Homo sapiens 
chromosome 19. 
cosmid F18382B, 
centromeric end, 
complete sequence
[Homo sapiens]



0.61 



(AF058446) histone 
3395782 macroH2A1.2 [Gallus gallus] | 6e-27



0.61 



(AL031603) conserved 
hypothetical protein.
3646450 [Schizosaccharomyces pombe] | 8e-29



0.61 



4176370 



(AC005058) similar to calcium-
independent phospholipase A2;
Similar to AC004392
(PID:g3367519) [Homo
sapiens]



2e-73 



0.60 



<NONE>



<NONE> 



! <NONE> 
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SEQ 
ID 



Nearest Neighbor fBlastN vs. Genbank) 



ACCESSION I DESCRIPTION 



302 | U40454


Candida albicans
topoisomerase type I
(CATOP1) gene,
complete cds



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



0.60 



DESCRIPTION 



P VALUE



<NONE> 



303 | J01390



304 1 m 172 



305 I Z81079 



306 



Z49627 



Emericella nidulans
mtDNA between
h2/h5 and bh2/b2
junctions, genes for
ATPase subunit 6,
cytochrome oxidase
subunit 3, seven
unidentified proteins,
twentyfour tRNA's
and L-rRNA. | 0.60

Plasmodium
falciparum RNA
polymerase I gene,
complete cds. | 0.60

Caenorhabditis
elegans cosmid
F39H11, complete
sequence
[Caenorhabditis
elegans] | 0.60

S.cerevisiae
chromosome X
reading frame ORF
YJR127c | 0.60



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



307 | U94911 



308 | U67476 



309 I U67513 



310 



U57817 



Rattus norvegicus H- 
K-ATPase alpha 2 
gene, alternatively 
spliced products and 

partial cds 

Methanococcus 



<NONE> 



118751 



<NONE> 
MAJOR DNA-BINDING
PROTEIN herpesvirus 1 (strain
11) >gi|60327 (X64346) major
ssDNA-binding protein
[Saimiriine herpesvirus 2]



jannaschii section 18 
of 150 of the 
complete genome 



Methanococcus 
jannaschii section 55 
of 150 of the 
complete genome 



0.60 



(AF003086) PfSNF2L 
2213862 [Plasmodium falciparum]



<NONE> 



9.6 



0.60 



(D89240) unnamed protein
1749688 product 



Haemophilus ducreyi 
lipoprotein gene, 
complete cds 



0.60 



3327421 



0.60 



4008577 



(U97068) zonadhesin [Mus
musculus] 



(AL034491) conserved 
hypothetical protein 
[Schizosaccharomyces pombe]



7.4 



5.7 



2.5 



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I SEQ f 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION | DESCRIPTION | P VALUE | ACCESSION | DESCRIPTION



311 I X807nn 



312



313



L42167 



U54777



H.sapiens G17 gene 



Mus musculus (clone 
R24) rds gene, partial 
cds 



314 I D8698S 



Human hMSH6 
mRNA. complete cds 



Human mRNA for 
KIAA0232 gene, 
complete cds 



0.60 



422541 



0.60 



4220848 



probable protein-tyrosine kinase
(EC 2.7.1.112) RTK - Pacific
electric ray >gi|290858



0.60 



2665637 



0.60 



315 I D43964 



316 I U490'iS 



317 | X843SS 



318 | AF125447



Rat liver mRNA for 
Kan-1, complete cds
Rattus norvegicus
CTD-binding SR-like
protein rA4 mRNA,

partial cds

U.ruddi
mitochondrial 12S
ribosomal RNA



Caenorhabditis 
elegans cosmid 
Y14H12B



0.60 



0.60 



0.60 



319 | U20189 



320 



321 



M63962 



AJ132366



Hyoscyamus muticus
clone cVS2 
vetispiradiene 
synthase mRNA 
partial cds 



Human gastric H,K 
ATPase catalytic 
subunit gene, 
complete cds. 



Helicobacter pylori 
(strain PI) comB and 
pmi/algA (partial) 
genes, and partial 
ORF1 and ORF2 



0.59 



0.59 



0.59 



0.59 



1938462 



(AF033823) moira [Drosophila 
melanogaster]



(AF031087) mismatch repair 
protein MSH6 [Mus musculus]



(U97006) No definition line 
found [Caenorhabditis elegans]
(IbDj'M) coded for by C.



1280135 



elegans cDNA cm21e6; coded 
for by C. elegans cDNA 
cm01e2; similar to melibiose 
carrier protein 

(thiomethylgalactoside permease 

in 



P VALUE



1.5 



2e-07 



(U37500) RNA polymerase II
largest subunit [Mus musculus]



5e-15



1e-19



3874247 



<NONE> 



(Z70205) predicted using 
Genefinder



2e-3' 



<NONE> 



I <NONE> I 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> j 



<NONE> 



<NONE> 



<NONE> 



I <NONE> | 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION
Mus musculus 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



322 | U17289



323 



Z71466 



324 | Z66493 



transcription factor 
AP-2 (AP-2) gene, 
alternative exon la, 
and isoform 2, partial | 
cds. 



S.cerevisiae
chromosome XIV
reading frame ORF
YNL190w



(AC002332) hypothetical
0.59 | 2459419 protein [Arabidopsis thaliana]



0.59



3875542 



(Z67990) Similarity to Rat
amiloride-sensitive sodium
channel beta-subunit



Beet soil-borne virus 
genes for 13K, 22K 
and 48K proteins 



cry V465 protein - Bacillus
0.59 | 2119867 thuringiensis thuringiensis]



P VALUE



9.4 



7.3 



325 I L41351 



Homo sapiens 
prostasin mRNA, 
complete cds 



326 I X79854 



S.lincolnensis gene 
for 16S ribosomal 
RNA 



0.59 



729212 



CRYSTALLIN J1C crystallin
[Tripedalia cystophora]



327 I AJ223356 

328 | X86019 



Strongylocentrotus 
purpuratus mRNA for 
SuDp98 protein 



0.59 



(AF056577) high mobility 
370 2828 [group protein 1.2 



H.sapiens mRNA for 
PRPL-2 protein 



0.59 



2495704 



HYPOTHETICAL PROTEIN
KIAA0129 product is novel.
[Homo sapiens]



(Y10027) transcription factor
0.59 | 1743341 TEF-1 [Mus musculus]



329 I U75528 



Xiphias gladius 
creatine kinase gene, 
partial cds 



0.59 



1845995 



330 I AC005573 



331 L19180 



332 



L32090 



Homo sapiens 
chromosome 5, PAC 
clone 202e13



Rat receptor-linked 
protein tyrosine 
phosphatase 



0.59 



2506366 



0.59 



1235974 



Listeria 

monocytogenes secA 
gene, complete cds. 



0.59 



2291129 



EPSILON SUBUNIT B DNA-
directed DNA polymerase (EC
2.7.7.7) II chain B - yeast

(Saccharomyces cerevisiae)
>gi|786319 (U25842) DNA
polymerase epsilon, subunit B
(Swiss Prot. accession number
P24482) [Saccharomyces
cerevisiae]



(X96713) collagen [Globodera
pallida]



(AF016415) No definition line
found [Caenorhabditis elegans]



4.2 



3.2 



2.5 



(U69477) envelope glycoprotein

[Human immunodeficiency virus

type 1] | 2.4

DNA POLYMERASE



1.4 



1.1 



0.83 



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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


333 | U24433


Xenopus laevis
syndecan-2 mRNA,
complete cds.


0.59 | 3355692


(Al-UJl ill) hypothetical
protein SC1C2.25c
[Streptomyces coelicolor]


0.64


334 | M23412


Drosophila
muscarinic
acetylcholine receptor
mRNA, complete cds


0.59 | 168237


(M76546) hydroxyproline-rich
protein [Helianthus annuus]


0.22


335 | AF060729


Synaphea media
chloroplast atpB-rbcL
intergenic spacer
region, partial
sequence


0.59 | 731596


HYPOTHETICAL 67.3 KD
PROTEIN IN PRPS4-STE20
INTERGENIC REGION
>gi|626567|pir||S46825
hypothetical protein YHL010c -
yeast (Saccharomyces
cerevisiae) >gi|2289881
(U11582) No definition line
found [Saccharomyces
cerevisiae]


0.16


336 | AF029734


Xanthobacter
autotrophicus
transcriptional
activator AldR (aldR)
gene, partial cds; and
NAD-dependent
chloroacetaldehyde
dehydrogenase (aldB)
gene, complete cds


0.59 | 2498801


PERIAXIN
>gi|2143901|pir||I58157 periaxin
- rat >gi|505297 (Z29649)
periaxin [Rattus norvegicus]


0.13


337 | X95307


C.reinhardtii LI818r-1
gene


0.59 | 1723781


HYPOTHETICAL 34.^3 KD
PROTEIN IN TAF145-YOR1
INTERGENIC REGION
>gi|2131717|pir||S64612
hypothetical protein YGR277c -
yeast (Saccharomyces
cerevisiae)
>gi|1323505|gnl|PID|e243248
(Z73062) ORF YGR277c
[Saccharomyces cerevisiae]


1e-04


338 | M24572


Dictyostelium
discoideum tRNA-
Glu-GAA gene, clone
jluGAAS.


0.59 | 1176186


HYPOTHETICAL 43.3 KD
GTP-BINDING PROTEIN IN
DACB-RPMA INTERGENIC
REGION >gi|606121 coli]


3e-06


339 | U73733


Human hMSH6 gene,
exon 2


0.59 | 2665637


(AF031087) mismatch repair
protein MSH6 [Mus musculus]


5e-07






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION | DESCRIPTION



P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



340 



D90747 



genomic DNA. (25.2 - 
[25.6 min) 



341 



J05211



Human desmoplakin
mRNA, 3' end.



0.59 



134286 



0.59 



246796 



342 



L24441 



Loligo pealii kinesin
light chain mRNA,
complete cds.



343 | M25140 



Human cardiac alpha-
myosin heavy chain
(MYH6) gene, exons
2, 3 and 4.



0.59 



547800 



0.58 



Homo sapiens
(subclone 9_h2 from
P1 H21) DNA

sequence | 0.58

Homo sapiens full
length insert cDNA
AF087966 clone YU5 1 CM | 0.58



344 | L81932



345 



346 



H.sapiens flow-sorted
chromosome 6 TaqI
fragment,
Z78574 SC6pA10GIl

Blattella germanica



0.58 



allatostatin
neuropeptide
precursor gene,
347 | AF068061 complete cds

Homo sapiens Cdc7



0.58 



348 



(CDC7) mRNA,
AF015592 complete cds



0.58 



349 



Methanosarcina
barkeri atp operon:
ATP synthase beta
subunit (atpD), ATP
synthase epsilon
subunit (atpC), ATP
synthase gene 1
(atpl). ATP synthase
AF028006 a subunit subunit (...



0.58 



350 



Mus musculus gene
for pancreatic trypsin,
AB017032 complete cds



<NONE> 



DOLICHOL KINASE 



major centromere protein, 
CENP-B [human. Peptide, 594 

aa] 

KINESIN LIGHT CHAIN

(KLC) sea urchin 
(Strongylocentrotus purpuratus) 
>gi|161530 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



3184291 



0.58 



3170561 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



(AC004136) putative DNA 
polymerase III gamma subunit



7. 



(AF056704) synapsin IIIa
[Rattus norvegicus] 



4e-08 



5e-14 



<NONE> I 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



9.4 



9.2 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


351


AF081585


Dictyostelium
discoideum
developmental
protein DG1110
(DG1110) gene,
partial cds


0.58 


1 105417 


basic proline-rich peptide IB-8a
human


9.2


352 


AF086322


Homo sapiens full
length insert cDNA
clone ZD53E01


0.58


93026


hypothetical protein - African
swine fever virus (strain Malawi
Lil-20/1) >gi|450758 (X71982)
myeloid differentiation antigen
homologue [African swine fever
virus] >gi|903686 (M95672)
unknown protein


7.1


353 


AF088025


Homo sapiens full
length insert cDNA
clone ZC19C04


0.58


(U92805) thrombospondin-3
2384644 [Xenopus laevis]


7.0


354 


AB002339


Human mRNA for
KIAA0341 gene, 
partial cds 


0.58 


2135587


M130 antigen (cytosolic variant
2) - human


5.4


355 


U67548 


Methanococcus 
jannaschii section 90 
of 150 of the 
complete genome


0.58


2911094


(AL021957) hypothetical
protein Rv2174


4.2


356


L07868


Homo sapiens
receptor tyrosine
kinase (ERBB4)
gene, complete cds.


0.58


461922


PYRUVATE
DECARBOXYLASE (8-10 NM
CYTOPLASMIC FILAMENT-
ASSOCIATED PROTEIN)
(P59NC) 4.1.1.1) - Neurospora
crassa >gi|293948 (L09125)
pyruvate decarboxylase
[Neurospora crassa]
>gi|1655909.


4.2


357


X03897


Bacillus subtilis
sigma 43 operon with
P23-dnaE-rpoD genes
(dnaE for DNA
primase, rpoD for
RNA polymerase)


0.58


1323704


(U55387) similar to C. elegans
F38E1.9 gene product encoded
by GenBank Accession Number
U41996 [Cricetulus griseus]


4.1


358


D76419


Desulfovibrio
vulgaris rbo gene for
desulfoferrodoxin and
rub gene for
rubredoxin, complete
cds


0.58


3420047


(AC004680) putative protein
kinase [Arabidopsis thaliana]


2.4
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SEQ 
ID 



Nearest Neighbor (BlastN vs. 



ACCESSION 



Genbank) 



DESCRIPTION 



282174 



M33642 



~BNA~ 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



sequence from 
cosmid B20F6 on 
chromosome 22. 
complete sequence 
[Homo sapiens] 



F.solani STI35 
protein gene, 
complete cds. 



DESCRIPTION 



P VALUE 



0.58 



2145455 (Y07866) catalase-peroxidase



0.58 



(AL021897) hypothetical
2896706 protein Rv1069c



2.4 



2.4 



U64873 



Mus musculus 
transforming growth 
factor alpha (TGF 
alpha) gene, partial 
cds 



AB002132



Macrophthalmus
banzai mitochondrial 
DNA for 12S and 
16S rRNA, partial 
and complete 
sequence 



0.58 



3874437 



(Z81038) predicted using 
Genefinder; cDNA EST 
yk488a2.5 comes from this gene
[Caenorhabditis elegans]



1.8 



0.58 



AF070070 



Caenorhabditis 
elegans MutS 
homolog (msh-5) 
mRNA. partial cds 



2960022 



(AJ224676) rho type GEF 
[Drosophila melanogaster]



1.8 



AF045240 



Staphylococcus 
epidermidis plasmid 
pIP1629 mobilization 
protein (mobCl), 
(orf69-l). (mobAl). 



(U75869) Omp22 [Helicobacter 
0.58 I 4098205 pylori] 



1.8 



0.58 



X61637 



H.sapiens Wilms 
tumor gene 1, exons 3 
and 9 



4218117 (AL035353) protein (fragment)



0.62 



(U88211) unknown [Gallus
0.58 | 2331059 gallus]



0.62 



AF039312 



D87463 



U40342 



Moraxella catarrhalis 
strain 4223 transferrin 
binding protein A 
(tbpA) and transferrin 
binding protein B 
(tbpB) genes, 
complete cds; and 
unknown gene 
Human mRNA for 



0.58 



120155 



FIBER PROTEIN 

>gi|74229|pir||ERADFM fiber 
protein - mouse adenovirus 1

>gi|209758 (M30594) fiber
protein [Mastadenovirus mus1]



0.27 



KIAA0273 gene, 
complete cds 



0.58 



3861477 



(U94177) androgen receptor
[Pan troglodytes] 



0.12 



Mus musculus ninein 
mRNA. complete cds 



0.58 



4115936 



(AF118223) No definition line
found [Arabidopsis thaliana]



0.004 
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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



369 



S57235 



CD68=110kda

transmembrane
glycoprotein [human,
promonocyte cell line
U937, mRNA, 1722
nt]



0.58 



2072301 



(U96113) WWP1 [Homo
sapiens]



1e-04



370 



U39391 



Mus musculus 
serotonin 1A receptor
mRNA, complete cds. 



0.58 



1469876 



(D63481) The KIAA0147 gene 
product is related to adenylyl 
cyclase. [Homo sapiens]



371 



D00056



Monkey B-

lymphotropic

papovavirus genes for

VP-1, 2, 3 and large
T antigen, complete 
and partial cds, strain 
LPV-76 > :: 
gb|M14494|PPMVPl 
M Monkey B- 
lymphotropic 
papovavirus mutant 
(LPV-76) PstI B 
fragment encoding 
VP1, VP2, VP3 and 
antiaen. 



372 | M77182 



373 



S72579 



374 | AF018165



Amsacta 
entomopoxvirus 
spheroidin gene, 
complete cds, and 
four vaccinia related 
orfs. > 

gb|I16670|I 16670 
Sequence 1 from 
patent US 54767S1 



loo-S=growth- 
associated protein 
GAP-43 homolog 



Tetraodon fluviatilis 
amyloid precursor 
protein mRNA. 
complete cds 



0.58 



2462069 



0.58 



J730722 



0.58 



2689720 



0.58 



3219938 



(AJ001774) vanadium 
chloroperoxidase 



Jropu. — _„ 
HVHUIKhilLAL4j.S KD ' 
PROTEIN IN NCE3-HHT2 
INTERGENIC REGION 
>gi|2131871|pir||S62957 
hypothetical protein YNL035c - 
yeast (Saccharomyces 
cerevisiae)

>gi|1301880|gnl|PID|e239670
(Z71311) ORF YNL035c
[Saccharomyces cerevisiae]



(AF037168) DnaJ homologue 
[Arabidopsis thaliana]



HYPOTHETICAL 34.9 KD 
PROTEIN C57A10.11C IN
CHROMOSOME I

>gi|2058378|gnl|PID|e314002
pombe] 



le-08 



7e-14 



5e-21



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SEQ 
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


375


U81803


Filobasidiella
neoformans
translation elongation
factor EF1-alpha
(CnTEF1) mRNA,
complete cds


0.57


<NONE> 


<NONE> 


<NONE>| 


376 1 


U09781 


Candida albicans 
ATCC 18804, CBS 
562 peptide 
transporter gene, 
complete cds. 


0.57 I 


<NONE> 


<NONE> 


<NONE>| 


377 1 


AC002143 


Homo sapiens *~ 
(subclone 4_bl0 from 
BAC H102) DNA 
sequence 


0.57 J 


<NONE> 


<NONE> 


<NONE> 1 


378 I 


U23442 


I'etrahymena 
thermophilaRR 
internal deletion 
sequence. 


0.57 1 


<NONE> 


<NONE> 


<NONE> j 


379 1 


U 17289 


Mus muscuius 
transcription factor 
AP-2 (AP-2) gene, 
alternative exon la. 
and isoform 2, partial 
cds. 


0.57 


<NONE> 


<NONE> 1 


<NONE> 1 


380 1 


] 
t 
\ 

X70844 r. 


3uzura suppressaria 
luclear polyhedrosis 
'irus gene for 
>olyhedrin protein 


0.57 1 


<NONE> 


<NONE> 


<NONE>| 


I f 

1 0 

381 1 AJ012I59 f> 


lomo sapiens 5T4 
ncofetal trophoblast 
lycoprotein gene 


0.57 1 


<NONE> 


<NONE> I 


cNONE> 1 


382 1 


h 
E 

X76571 c 


[.sapiens simple 
»NA sequence region 
one wgla8. 


0.57 1 


<NONE> 


<NONE> [, 


;NONE> 1 



3Sf 



WO 01/02568 
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

383 | AF034434 | Vibrio cholerae pathogenicity island putative transposase, aldehyde dehydrogenase (aldA), toxR-activated gene A protein (tagA), putative inner membrane protein, and putative zinc metalloprotease genes, complete cds; and... | 0.57 | <NONE> | <NONE> | <NONE>
384 | AB017031 | Mus musculus gene for TESP4, complete cds | 0.57 | <NONE>
385 | X89788 | p.nispiaus mitochondrial DNA for SSU ribosomal RNA gene | 0.57 | <NONE>
386 | L16921 | Rat progesterone receptor gene, 5' untranslated region. | 0.57 | 3323116 | (AE001251) femA protein, putative [Treponema pallidum] | 8.9
387 | AF027292 | Homo sapiens interferon regulatory factor 6 | 0.57 | 259790 | (MS 157) DNA polymerase-primase 180 kda subunit [Drosophila melanogaster, Peptide, 1490 aa] | 6.7
388 | AJ012581 | Cicer arietinum mRNA for cytochrome P450 | 0.57 | 2131498 | hypothetical protein YDR446w - yeast CAI: 0.11 [Saccharomyces cerevisiae] | 5.3
389 | L15363 | Human transfer RNA-Met (TRMEP1) pseudogene, complete gene | 0.57 | 3228680 | (AF070935) GABA receptor subunit [Musca domestica] | 5.2
390 | AE000525 | Helicobacter pylori 26695 section 3 of 134 of the complete genome | 0.57 | 1938478 | J97008) weak similarity to family I of G-protein coupled receptors [Caenorhabditis elegans] | 4.0
391 | AF020189 | Amblyomma americanum ecdysteroid receptor (AamEcR) mRNA, 3' UTR, region 1 | 0.57 | 2072224 | 94875) p40 [Borna disease virus] | 4.0

ACCESSION



401 



V00829 



402 



X53092 



Mouse complete gene
for a mouse kallikrein
gene. Genes are mGK-
1 (complete gene)
and mGK-2 of
hormones, e.g.,
grow... > ::

gb|J00390|MUSKAL07
Mouse pseudo-
kallikrein 2, exons 4
and 5, and kallikrein
gene, complete cds.



0.57 



Chicken mRNA for 
beta-2 subunit of 
neuronal nicotinic 
acetylcholine receptor



0.57 



ACCESSION 



DESCRIPTION 



P VALUE



2500916 



NUCLEAR HORMONE
RECEPTOR NOR-2 receptor
[Rattus norvegicus]
>gi|1583604|prf||2121281A
NOR-2 protein [Rattus
norvegicus]



0.20 



1072256 



(U40953) similar to matrin F/G
(SP:Q00910) containing C4-
type zinc-fingers
[Caenorhabditis elegans]



403 | L07939



Ovis ovis granulocyte
colony stimulating
factor



404 | U18061



Colletotrichum 
gloeosporioides 
CAP20 (cap20) gene 
complete cds. 



0.57 



3874345 



S iujoj predicted using
Genefinder; Similarity to
dehydrogenases; cDNA EST
EMBL:D65800 comes from this
gene; cDNA EST
EMBL:D76184 comes from this
gene; cDNA EST
EMBL:D69322 comes from this
gene; cDNA EST
EMBL:C08158 comes f..



3e-07 



405 



Z73955 



L.japonicus mRNA 
for small GTP- 
binding protein, 
RAB11G 



0.57 



(AC003974) putative ubiquitin
specific protease



9e-08



0.57 



112894 



ALPHA-INDUCED PROTEIN 
3 (PUTATIVE DNA BINDING
PROTEIN A20) (ZINC
FINGER PROTEIN A20)
>gi|107549|pir||A35797
probable DNA-binding protein
A20 - human >gi|177S66
(M59465) A20



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION 



415 | X57626 | M. javanica mitochondrion ATPase 6, and putative tRNA-f-Met and tRNA-His genes

416 | AB003363 | Sus scrofa S100C gene, complete cds

417 | L42291 | Danio rerio DANA element. imrnn 4.

418 | AF031826 | Mus musculus leukocystatin gene, complete cds



419 



U17068



Pennisetum glaucum
Ac-like element,
AcL2.



DESCRIPTION 



0.56 



<NONE> 



0.56 



<NONE> 



0.56 



2650002 



0.56 



462493 



420 | Z48042 



H.sapiens mRNA
encoding GPI-
anchored protein
p137



421 I AF027657 



422 | AB011540



423 | X59941



Choristoneura
fumiferana
entomopoxvirus
nucleotide
triphosphate
phosphohydrolase I
(NPHI) gene,
complete cds



Homo sapiens mRNA 
for MEGF7, partial 
cds 



X.maculatus NGF 
gene for nerve growth 
factor 



0.56 



399449 



0.56 



141232 



P VALUE



<NONE> 



<NONE> 
(AE001062) conserved
hypothetical protein
[Archaeoglobus fulgidus]



L-LACTATE
DEHYDROGENASE
(IMMUNOGENIC PROTEIN
P36) >gi|479296|pir||S33362 L-
lactate dehydrogenase (EC
1.1.1.27) - Mycoplasma
hyopneumoniae 



ESCARGOT/SNAIL PROTEIN 
HOMOLOG 



HYPOTHETICAL 8.7 KD 



PROTEIN (READING FRAME 
D)>gi|76316|pir||QQSA7C 
hypothetical protein E-74
PUTATIVE



0.56 



464999 



0.56 



1718033 



0.56 



1169081 



ACETYLCHOLINE 
REGULATOR UNC-18

>gi|480359|pir||S36747
acetylcholine regulator unc-18 -
Caenorhabditis elegans
>gi|247392|bbs|100294 putative
acetylcholine regulator unc-18



URACIL-DNA
GLYCOSYLASE (UDG)
herpesvirus 2 >gi|695219
(U20824) uracil DNA
glycosylase

COMMON PLANT
REGULATORY FACTOR
CPRF-1 >gi|515621 (X58575)
light-inducible protein CPRF-1
[Petroselinum crispum]
>gi|149S301 (U46217) CPRF1



<NONE> 



<NONE> 



8.7 



6.7 



6.7 



6.7 



5.1 



5.1 



3.S 



Nearest Neighbor (BlastN vs. Genbank)

DESCRIPTION | P VALUE



DESCRIPTION 



PROTEIN IN GLK1-SRO9
INTERGENIC REGION
>gi|83159|pir||S19367
hypothetical protein YCL039w -
yeast (Saccharomyces
cerevisiae) ( Q35



(AF038535) synaptotagmin VII 
[Homo sapiens 1 i 0 J? 




P VALUE



(AF030962) unknown 
[Schistosoma mansoni]



(Z78411) F02D8.3
[Caenorhabditis elegans]

(AC004665) unknown protein
[Arabidopsis thaliana]



(AF072709) putative
oxidoreductase [Streptomyces
lividans]



(U67951) contains similarity to
ATP/GTP-binding site motif
(PS:PS00017) [Caenorhabditis
elegans]



(U41558) K02B2.3 gene
product [Caenorhabditis 
elegans] 



<NONE> 



0.11 



7e-06 



1e-06



1e-10



1e-14



(AF016452) similar to the beta
transducin family | 4 e .ig



6e-20 



2e-31 



<NONE> 
SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



446 



447 



448 



449 



450 



DESCRIPTION 



D300IO 



U51991 



M18858



Rice mRNA EN1 17. 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



partial sequence 



Escherichia coli 
phosphoprotein 
phosphatase 



0.55 



<NONE> 



DESCRIPTION 



Mouse T cell receptor
C-gamma-7.1 mRNA,
3' end.



0.55



<NONE>



U95218 



M14948



AB002353 



451 | L81689 



Homo sapiens T cell- 
death associated 
protein gene, 
complete cds 



0.55 



<NONE> 



Human R-ras gene, 
exon 1. 



0.55 



<NONE> 



Human mRNA for 
KIAA0355 gene, 
complete cds 



0.55 



<NONE> 



Homo sapiens 
(subclone l_d6 from 
PI H54) DNA 
sequence 



0.55 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



P VALUE



<NONE>



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



0.55 



<NONE> 



452 | M6895-;



453 | X62953



<NONE> 



Human myristoylated
alanine-rich C-kinase 
substrate (MACS) 
gene, 5' end. 



0.55 



3322710 



R.norvegicus mRNA 

(pJG)16) with 

repetitive elements 
Synechocystis sp.



0.55 



1076802 



454 | L34630 



455 | U43521



mntABC transporter
system: periplasmic-
binding protein
(mntC), complete cds;
(mntA) gene,
complete cds;
membrane protein
(mntB) gene,
complete cds.



Plasmodium berghei
merozoite surface
protein-1 gene,
complete cds 



0.55 



2117632 



(AE001220) V-type ATPase, 
subunit B (atpB-1) [Treponema 
pallidum]



extensin-like protein - maize
>gi|600118 mays]



<NONE> 



<NONE> 



5.0 



hydrogen dehydrogenase (EC 

1.12.1.2) - Clostridium
acetobutylicum >gi|557064
(U15277) hydrogenase I
[Clostridium acetobutylicum]



0.55



127654 | MYOGLOBIN



5.0 



4.9 



SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

456 | Z64937 | H.sapiens CpG DNA clone 17g7, reverse | 0.55 | 417298 | MFS18 PROTEIN PRECURSOR | 3.8
457 | U10914 | Macaca mulatta clone irh83 T-cell receptor alpha chain mRNA, partial cds. | 0.55 | 310406 | (L09212) tat protein [Simian immunodeficiency virus] | 3.8
458 | AF022838 | Homo sapiens multidrug resistance protein | 0.55 | 1585251 | traB gene [Amycolatopsis methanolica] | 2.8
459 | M35603 | Mouse Hox-3.1 gene and Hox-3.2-Hox-3.1 intergenic region. | 0.55 | 818849 | (U25430) nucleotide pyrophosphatase precursor [Oryza sativa] | 2.0
460 | AE001395 | Plasmodium falciparum chromosome 2, section 32 of 73 of the complete sequence | 0.55 | 137532 | PROTEIN C2 >gi|74386|pir||WZVZB6 59K HindIII-C protein - vaccinia virus (strain WR) | 1.7
461 | AE001 10S | Plasmodium falciparum chromosome 2, section 32 of 73 of the complete sequence | 0.55 | 137532 | PROTEIN C2 >gi|74386|pir||WZVZB6 59K HindIII-C protein - vaccinia virus (strain WR) | 1.7
462 | U59736 | Human transcription factor (NFATc.b) mRNA, complete cds | 0.55 | 3327144 | (AB014565) KIAA0665 protein [Homo sapiens] | 0.096
463 | U34860 | Saccharomyces cerevisiae origin recognition complex large subunit (ORC1) gene, complete cds | 0.55 | 140372 | HYPOTHETICAL 86.0 KD PROTEIN IN GLK1-SRO9 INTERGENIC REGION >gi|83159|pir||S19367 hypothetical protein YCL039w - yeast (Saccharomyces cerevisiae) | 0.017
464 | AF012341 | Homo sapiens glutaryl-CoA dehydrogenase (GCDH) gene, exons .7. 8, 9, and 10 | 0.55 | 1166611 | (U46674) coded for by C. elegans cDNA yk27d9.5; coded for by C. elegans cDNA yk27d9.3; short region of weak homology to drosophilia suppressor of sable protein | 0.00s

SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

465 | AF004891 | HIV-1 isolate Q98-CxA from Kenya, envelope glycoprotein C2V3 region (env) gene, partial cds | 0.54 | <NONE> | <NONE> | <NONE>
466 | Y10159 | D.discoideum racGAP gene | 0.54 | <NONE> | <NONE> | <NONE>
467 | AB001895 | Homo sapiens mRNA for B120, complete cds | | <NONE> | <NONE>
468 | X12357 | Bovine gene for aspartyl protease NM 1 exons 3 and 4 > :: !cl|X12357 Bovine aspartyl protease and 4. | 0.54 | <NONE> | <NONE> | <NONE>
469 | AE001151 | Borrelia burgdorferi (section 37 of 70) of the complete genome | 0.54 | <NONE> | <NONE> | <NONE>
470 | X92052 | H.sapiens mRNA for T cell receptor alpha chain | 0.54 | <NONE> | <NONE> | <NONE>
471 | U00938 | Mus musculus ileal lipid-binding protein gene, complete cds | 0.54 | 1009712 | (U27698) calreticulin [Arabidopsis thaliana] | 4.9
472 | X68367 | M.thermoformicicum complete plasmid pFZ1 DNA | 0.54 | 125272 | CASEIN KINASE II, ALPHA CHAIN (CK II) >gi|419938|pir||A43297 casein kinase II (EC 2.7.1.-) alpha chain - Theileria parva >gi|161871 (M92084) casein kinase II alpha subunit [Theileria parva] | 4.7
473 | Z61098 | H.sapiens CpG DNA, clone 44c4, reverse read cpg44c4.rt1a. | 0.54 | 4191274 | (AJ131094) Xvent-1B protein [Xenopus laevis] | 3.7
474 | M63962 | Human gastric H,K-ATPase catalytic subunit gene, complete cds. | 0.54 | 388164S | (Z70757) similar to serine protease inhibitor [Caenorhabditis elegans] | 3.7
475 | XS6019 | H.sapiens mRNA for PRPL-2 protein | 0.54 | 164882S | 387963) ETF-related factor-1 (ETFR-1) | 2.1




SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



476 | X89010



477 | AB007836 



S.glaucescens genes 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE 



ACCESSION 



strU, strX, strV and
strW for 5'-
hydroxystreptomycin
production and
transport
polypeptides



Homo sapiens mRNA
for Hic-5, partial cds



DESCRIPTION 



0.54 



3550345 



0.54 



1097213 



P VALUE 



(AF084524) cellular repressor 
of E1A-stimulated genes CREG
[Mus musculus] 



ORF 1 [Streptomyces 
lavendulae] 



0.25 



478 



U32622 



479 



D61394



Comamonas 
testosteroni TsaR 
(tsaR), 

toluenesulfonate 
methyl- 

monooxygenase 
oxygenase component 
component (tsaB), 
toluenesulfonate zinc 
independent alcohol
dehydrogenase...



Arabidopsis thaliana 
gene for beta-VPE, 
complete cds 



0.54 



3875351 



(Z96047) DY3.6 
[Caenorhabditis elegans]



0.006 



0.53 



<NONE> 



<NONE> 



<NONE> 



480 I D61394 



Arabidopsis thaliana 
gene for beta-VPE, 
complete cds 



0.53 



481 



Z33072 



482



U45975 



483 



Z71324 



484 



L32090 



M.capricolum DNA 
for CONTIG MC097 



<NONE> 



<NONE> 



<NONE> 



0.53 



<NONE> 



<NONE> 



<NONE> 



Human 

phosphatidylinositol 
(4,5)bisphosphate 5- 
phosphatase homolog 
mRNA, partial cds.



S.cerevisiae 
chromosome XIV 
reading frame ORF 
YNL04Sw 



Listeria 

monocytogenes secA 
gene, complete cds. 



0.53 



<NONE> 



<NONE> 



<NONE> 



0.53 



2135586 



M130 antigen (cytosolic variant
1) - human 



0.53 



2291129



(AF016415) No definition line
found [Caenorhabditis elegans]



0.70 



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



DESCRIPTION 



485 



D86423 



486 | Y15969



487 | M2748Q



488 | D87004 



Mus musculus mRNA 



for HGT keratin,
partial cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



Mus musculus V 
kappa 21-6 gene, 
partial

Mus musculus (clone
3F9) transcribed
germline T cell
receptor gamma chain 
(Tcr-g) mRNA, VJ4 
C4 region 



0.53 



1235974 



0.52 



<NONE> 



0.52 



489 | Z99704 



Human (lambda) 
DNA for 

immunoglobulin light

chain | 0.53

Human DNA
sequence from
cosmid E75B8 on
chromosome 22,
complete sequence
[Homo sapiens] | 0.51



3875542 



P VALUE



(X96713) collagen [Globodera
pallida] | 0.41



<NONE>



(Z67990) Similarity to Rat
amiloride-sensitive sodium
channel beta-subunit



<NONE>



4.6 



1766073 



(U37272) winged helix protein
CWH-1 [Gallus gallus]



490 | U76523



491 | U32795



<NONE>



<NONE>



<NONE>



492 | M14602



Sambucus nigra lectin
precursor mRNA,
complete cds | 0.51

Haemophilus
influenzae Rd section
110 of 163 of the
complete genome | 0.50



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



493 | D87075



Human myoglobin
gene, exon 2. | 0.49



Human mRNA for
KIAA0238 gene,
partial cds | 0.24



478384 



1938429 



helicase homolog g10L protein -
African swine fever virus
>gi|414091 (X72951) G10L 125

KDa protein | 7.0

(U97002) similar to
Schizosaccharomyces pombe 4-
nitrophenylphosphatase
(PNPPASE) (SP:Q00472,
NID:g5004) [Caenorhabditis
elegans] | 2.5



Xenopus laevis
mitotic

phosphoprotein 90
494 | U95102 | mRNA, complete cds | O.-n



<NONE> 



<NONE> 



<NONE> 



SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

506 | M24543 | Human prostate-specific antigen (PA) gene, complete cds. | 0.21 | 1938527 | (U97012) C04E6.2 gene product [Caenorhabditis elegans] | 2.7
507 | M62470 | Mouse thrombospondin (THBS1) gene, complete cds. | 0.21 | 548563 | RNA REPLICASE POLYPROTEIN 2.7.7.48) - Erysimum latent virus >gi|3892232 (AF098523) replicase protein [Erysimum latent virus] | 2.1
508 | Y13544 | Homo sapiens cosmid CI | 0.21 | 1235710 | (L40584) polyprotein [Infectious pancreatic necrosis virus] | 2.0
509 | M24193 | Chicken MHC B complex protein (C12.3) mRNA, complete cds. | 0.21 | 3600102 | (AF090441) extracellular reelin [Gallus gallus] | 0.52
510 | X97161 | H.sapiens TFE3 gene, exon 4,5 & 6 | 0.21 | 854065 | (X83413) U88 [Human herpesvirus 6] | 0.30
511 | X67649 | R.norvegicus DNA sequence for LFB1/HNF1 promoter | 0.21 | 3913114 | TRANSCRIPTION FACTOR COUP 2 COUP-TFII - chicken >gi|392817 (U00697) orphan receptor COUP-TFII [Gallus gallus] | 0.004
512 | U63807 | Fugu rubripes growth hormone (GH) gene, complete cds | 0.21 | 3510505 | (AF0308S1) pol polyprotein [Fugu rubripes] | 3e-04
513 | Z95636 | H.sapiens mRNA for laminin alpha 5 chain | 0.21 | 400350 | NAM7 PROTEIN (NONSENSE MEDIATED MRNA DECAY PROTEIN 1) (UP-FRAMESHIFT SUPPRESSOR 1) factor NAM7 - yeast (Saccharomyces cerevisiae) >gi|4023 | 1e-07
514 | U91907 | Mirounga leonina major histocompatibility complex class II (DQA) gene, partial cds | 0.20 | <NONE> | <NONE> | <NONE>
515 | Z35758 | Transmissible gastroenteritis virus "FI virion protein genes | 0.20 | <NONE> | <NONE> | <NONE>
516 | X00334 | Drosophila virilis simple DNA sequence (pDv-19) | 0.20 | <NONE> | <NONE> | <NONE>

SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

517 | 1T1 / U / H I | Homo sapiens biliary glycoprotein (BGP) gene, partial cds. | 0.20 | <NONE> | <NONE> | <NONE>
518 | D7SS 1 S | Mus musculus rae28 gene, exon 1 and 5' flanking region | 0.20 | <NONE> | <NONE> | <NONE>
519 | M62975 | Drosophila melanogaster RNA polymerase II second largest subunit upstream (DmRP140) gene, exons 1-4. | 0.20 | <NONE> | <NONE> | <NONE>
520 | M27260 | Chicken 78-kD glucose-regulated protein, complete cds. | 0.20 | <NONE> | <NONE> | <NONE>
521 | AF076470 | Rice tungro bacilliform virus Serdang strain, complete genome | 0.20 | <NONE> | <NONE> | <NONE>
522 | AF076470 | Rice tungro bacilliform virus Serdang strain, complete genome | 0.20 | <NONE> | <NONE> | <NONE>
523 | U04636 | Human cyclooxygenase-2 (hCox-2) gene, complete cds. | 0.20 | <NONE> | <NONE> | <NONE>
524 | AE001430 | Plasmodium falciparum chromosome 2, section 67 of 73 of the complete sequence | 0.20 | <NONE> | <NONE> | <NONE>
525 | AF043514 | Mus musculus phosphomannomutase (pmm2) mRNA, complete cds | 0.20 | 3025006 | HYPOTHETICAL 15.5 KD PROTEIN IN MOAE-RHLE INTERGENIC REGION >gi|1787009 (AE000181) orf, hypothetical protein [Escherichia coli] | 9.S
SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION



P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



526 | U23144



Xenopus laevis FTZ-
F1-related nuclear

orphan receptor

variant (xFF1rAshort)
mRNA, complete cds.



DESCRIPTION 



0.20 



3184402 



527 | U14621



Paracentrotus lividus
Pax-6 (suPax-6)
mRNA, complete cds



0.20 



465894 



(AB014477) period protein

[Chymomyza costata]

PROBABLE MICROSOMAL

SIGNAL PEPTIDASE 23 KD

SUBUNIT (SPC22/23)

>gi|630688|pir||S44854
K12H4.4 protein -
Caenorhabditis elegans 
>gi|289708 (L14331) homology 
with signal peptidase; coded for 
by C. elegans cDNAs GenBank: 
M79661, M79662 and M79663; 
putative 



P VALUE



9.6 



528 | AF0305U



Actinobacillus
pleuropneumoniae
MRP ATPase
homolog (mrp) gene,
partial cds; ApxIVA
var3 (apxIVA) gene,

complete cds; and
beta-galactosidase

(lacZ) gene, partial

cds



0.20 



1175966 



529 | AF070581 | Homo sapiens clone 24540 mRNA sequence



530 | X75437



T.maritima pgK gene 
for 3- 

phosphoglycerate 
kinase 



531 | U32686 | Haemophilus influenzae Rd section 1 of 163 of the complete genome



532 | Z28081



S.cerevisiae 
chromosome XI 
reading frame ORF 
YKL081w



0.20 



542394 



0.20 



825648 



0.20 



3309593 



HYPOTHETICAL 45.3 KD 
PROTEIN IN THI5 5'REGION 
>gi|1084720|pir||S56193
probable membrane protein 
YFL062w - yeast 
(Saccharomyces cerevisiae) 



glyoxal oxidase (EC 1.2.3.-) 
precursor - basidiomycete 
(Phanerochaete chrysosporium) 
>gi|1050302



(Z34531) coproporphyrinogen 
oxidase [Homo sapiens]



0.20 



2507201 



(AF072878) ciliary outer arm 
dynein beta heavy chain 



CARBON CATABOLITE
DEREPRESSING PROTEIN
KINASE >gi|1469803 (L78129)
serine/threonine kinase [Candida
albicans]



7.2 



5.8 



5.8 



5.6 



5.5 



SEQ

ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank) 



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



542 | AB011179 | Homo sapiens mRNA for KIAA0607 protein, partial cds



543 | X75318



H.sapiens ITIH1 gene
(exon 22) and ITIH3



544 | AB008374 | Oncorhynchus mykiss mRNA for alpha 3 type I collagen, partial cds



P VALUE | ACCESSION



DESCRIPTION



P VALUE



0.20 



2U3753 



0.20 



629557 



545 



U09809 



Limulus polyphemus
arginine kinase
mRNA, complete cds



546 | AB020671



Homo sapiens mRNA
for KIAA0864
protein, partial cds



Phytophthora
megasperma
mitochondrial
ORF152, complete
cds, cytochrome c
oxidase subunit I
(cox1) gene,
complete cds,
cytochrome c oxidase
subunit II

Phytophthora
megasperma
mitochondrial
ORF152, complete
cds, cytochrome c
oxidase subunit I
(cox1) gene,
complete cds,
cytochrome c oxidase
subunit II



0.20 



1082610 



0.20 



3882016 



547 | L04457



548 | L04457



0.20 



2674350 



0.20 



746516 



gene VGF protein - rat

>gi|205690 (M60525) nerve
growth factor inducible protein
[Rattus norvegicus] >gi|205701
(M60522) nerve growth factor-
inducible protein [Rattus
norvegicus] >gi|207651



0.20 



746516 



RNA-binding protein mpD -
Arabidopsis thaliana (fragment)
>gi|510240 (X61108) RNA
binding protein [Arabidopsis
thaliana]



0.39 



muf1 protein - human
>gi|762953 (X86018) muf1
[Homo sapiens]



(AJ012650) CP [Papaya
ringspot virus]



(U93121) M-phase
phosphoprotein-1 [Homo
sapiens]



0.38 



0.37 



0.37 



0.1S 



(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]



(U23517) D1022.7
[Caenorhabditis elegans]
>gi|3258651 elegans]



0.043 



0.043
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

549 | S82819 | Cdk5r=cyclin-dependent kinase 5 regulatory subunit p35 [mice, brain, 129/SvJ C57RF Jf, Genomic/mRNA, 5528 nt] | 0.20 | 3413870 | (AB007923) KIAA0454 protein [Homo sapiens] | 0.020
550 | D3I79? | Streptomyces griseus DNA for serine/threonine protein kinases, complete cds | 0.20 | 861405 | (U29154) T07F12.2 gene product [Caenorhabditis elegans] | 0.019
551 | U97499 | Homo sapiens butyrophilin (BT3.2) gene, exons 5-10, and | 0.20 | 2773341 | (AF040954) putative protein phosphatase 1 nuclear targeting subunit [Rattus norvegicus]
552 | U31463 | Rattus norvegicus nonmuscle myosin heavy chain-A mRNA, complete cds. | 0.20 | 3880111 | predicted using Genefinder | 0.008 | 0.002
553 | X78401 | Bacteriophage P22 right operon, orf 48, replication genes 18 and 12, nin region genes, ninG phosphatase, late control gene 23, orf 60, complete cds, late control region, start of lysis gene 13 | 0.20 | 1123087 | (U42436) C49H3.3 gene product [Caenorhabditis elegans] | 4e-04
554 | X57310 | Nocardia lactamdurans pcbAB and pcbC genes for alpha-aminoadipyl-L-cysteinyl-D-valine synthetase and isopenicillin N synthase | 0.20 | 1723511 | PUTATIVE ENDONUCLEASE C1F12.06C yeast (Schizosaccharomyces pombe) >gi|12179S0 (Z69944) unknown [Schizosaccharomyces pombe] | 4e-09
555 | X62386 | S.epidermidis genes epiY', epiY, epiA, epiB, epiC, epiD, epiQ, epiP | 0.20 | 3874927 | (Z73424) C44B9.1 [Caenorhabditis elegans] | 3e-10
SEQ

ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank)



DESCRIPTION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



DESCRIPTION



U/HU^zu; similar to nucleotide



P VALUE



binding protein; cDNA EST
EMBL:M75897 comes from this
gene; cDNA EST
EMBL:M89054 comes from this
gene; cDNA EST
EMBL:D26713 comes from this
gene; cDNA EST
EMBL:D26718 comes from this
gene; cDNA...




566 | Z8262S



<NONE>



<NONE>



<NONE>



SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE

576 | AB002333 | Human mRNA for KIAA0335 gene, complete cds | 0.19 | <NONE> | <NONE> | <NONE>
577 | U53566 | Macaca mulatta pit-1/GHF-1 transcription factor mRNA, complete cds | 0.19 | 1078068 | probable membrane protein YLR311c - yeast | 9.2
578 | U73664 | Human t(11;14)(q13;q32) breakpoint junction sequence | 0.19 | 116734 | COAT PROTEIN (CAPSID PROTEIN) virus >gi|58901 (X62133) CyMV coat protein gene product | 8.8
579 | AF004054 | Heterophyllaea pustulata rps16 gene, chloroplast gene, partial intron sequence | 0.19 | 1928991 | (U92815) heat shock protein 70 | 8.7
580 | Z27081 | Caenorhabditis elegans cosmid M01A8, complete sequence [Caenorhabditis elegans] | 0.19 | 2496247 | HYPOTHETICAL ATP-BINDING PROTEIN MJ0625 >gi|2128413|pir||A64378 hypothetical protein MJ0625 - Methanococcus jannaschii >gi|1591336 (U67510) M. jannaschii predicted coding region MJ0625 | 8.6
581 | Z74145 | S.cerevisiae chromosome IV reading frame ORF YDL097c | 0.19 | 1174425 | TYROSINE-PROTEIN KINASE SPK-1 | 6.7
582 | D38547 | Small round structured virus genomic RNA, 3'terminal sequence containing ORF2 and ORF3 | 0.19 | 971318 | (Z48053) putative protein [Bovine herpesvirus 1] | 5.1

SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION



DESCRIPTION



P VALUE



583 | D88000



584 | U67462 



585 | L23906 



586 | AE001462



DNA for 16S ribosomal

RNA > ::

dbj|D88002|D88002
Ralstonia eutropha
DNA for 16S
ribosomal RNA > ::
dbj|D88003|D88003
Ralstonia eutropha
DNA for 16S
ribosomal RNA > ::
dbj|D88004|D88004
Ralstonia eutropha
DNA for 16S
ribosomal RNA



Methanococcus 
jannaschii section 4 
of 150 of the 
complete genome 



Gallus domesticus 
microsatellite DNA 
marker. 



Helicobacter pylori, 
strain J99 section 23 
of 132 of the
complete genome



0.19 



3800952 



(AF100657) No definition line 
found [Caenorhabditis elegans]



0.19



3183617



0.19 



1947094 



587 | M19460



588 



U22349 



P.putida catBC 
operon encoding 
cis,cis-muconate
lactonizing enzyme I 
and muconolactone 
isomerase genes, 
complete cds. 



Tetrahymena australis 
telomerase RNA 
gene, complete
sequence 



0.19 



1730177 



0.19 



3873843 



0.19 



4105782 



(AJ005586) MYB-related
transcription factor 
[Antirrhinum majus] 



(U93074) voltage-gated sodium 
channel homolog BdNal 



(AF049922) PGP169-12
[Petunia x hybrida]



5.1 



4.0 



3.9 



GLUCOSE-6-PHOSPHATE 

ISOMERASE (GPI) 

ISOMERASE) (PHI) 

>gi|2118333|pir||I48073 glucose 

phosphate isomerase - Chinese 

hamster >gi|987046 griseus] | 3.9
U8225bj cDNA EST

yk251g7.3 comes from this 

gene; cDNA EST yk251g7.5 

comes from this gene; cDNA 

EST EMBL:D68223 comes 

from this gene; cDNA EST 

EMBL:C12737 comes from this

gene; cDNA EST yk389c8.5 

comes from this gene; cDNA 

E... | 3.9



3.2 






Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION



589 



L27745 



590 | AF049588



DESCRIPTION | P VALUE



Homo sapiens voltage
operated calcium
channel, alpha-1
subunit mRNA, 
complete cds. 
Canis familiaris 
synapsin I gene, 
partial cds 



591 | X06627 



592 | X61597 



Staphylococcus 
aureus plasmid pS194 
sequence 



M.musculus gene for 
kallikrein-binding 
protein 



ACCESSION 



0.19 



3763926 



0.19 



4104931 



0.19 



137927 



593 | AF01624? 



594 | AF004447



595 



J04S2 I 



596 | AF059650



Dictyostelium
discoideum protein
synthesis elongation
factor 1-alpha (tef2)
gene, partial cds
Venezuelan equine



0.19 



2982874 



ant I 



DESCRIPTION 



(AC004450) unknown protein 
[Arabidopsis thaliana]



(AF042196) auxin response 
factor 8 [Arabidopsis thaliana]
PRE-NECK APPENDAGE
PROTEIN (LATE PROTEIN
GP12) >gi|75856|pir||WMBP22
gene 12 protein - phage phi-29
>gi|215330 (M14782) pre-neck
appendage protein
[Bacteriophage phi-29]
>gi|225367|prf||1301270G gene
12 [Bacteriophage phi-29] 



encephalitis virus 
strain 1327 
polyprotein gene, 
partial cds > ::
gb|AF004460|AF00
4460 Venezuelan
equine encephalitis
virus strain 1385
polyprotein gene,
partial cds 



133659 



Human elastin (ELN)
gene, exon 1, clones
HELC-5 and HELC-



PUTATIVE RNA-DIRECTED
RNA POLYMERASE 



Homo sapiens histone 
deacetylase 3
(HDAC3) gene,
complete cds



0.19


4096173


(U25968) early embryogenesis
protein [Oryza sativa]


1.3


0.19


1170523 


INHIBIN BETA B CHAIN
PRECURSOR inhibin precursor
- bovine >gi|563753 (U16241)
betaB inhibin/activin precursor
[Bos taurus]


1.3






PROBABLE TRANSPORT
PROTEIN CY21C12.il 




0.19


3024881


>gi|2078066|gnl|PID|e315171
(Z95210) betP


0.83



SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



DESCRIPTION | P VALUE | ACCESSION



597 | M69053



D.melanogaster
calcium-activated K+
channel subunit



598 | AF076279



Dictyostelium
firmibasis plasmid
Dfp1, complete
plasmid sequence



0.19 



0.19



599 



Mouse MCNP gene 
for C-type natriuretic 
peptide, complete cds
D28S73 (exon1, exon2)



Oxytricha nova
macronuclear actin II
600 | J/0607I gene, complete cds.

Homo sapiens CLP

601 | L54057 mRNA, partial cds.

P.lividus cDNA for
602 | X89806 COLL2alpha gene



0.19 

0.19 
0.19 



Archaeoglobus
fulgidus section 3 of
172 of the complete
603 AE001104 genome



0.19 



0.19 



606 



Human Gps1 (GPS1)
U20285 mRNA, complete cds
Human gene for
interleukin 3 receptor
alpha subunit, exon
10



0.19 



607 | D49408



DESCRIPTION 



1707984 



0.19 



(FD-GOGAT) 

>gi|2126524|pir||S60228
glutamate synthase (ferredoxin)

(EC 1.4.7.1) gltB -

Synechocystis sp. (PCC 6803)
>gi|515938 (X80485) glutamate
synthase



P VALUE



0.80 



453986 



(U00008) yejA [Escherichia
coli]



0.79 



2650444 

1584024 
3036883 



(AE001092) acetyl-CoA
synthetase (acs-1)
[Archaeoglobus fulgidus]

complement control protein
[Botryllus schlosseri]
(AL022374) putative ABC
transporter



0.63 



3638957 



(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
[Homo sapiens] | 0.41



2315192 



Rattus norvegicus
microsatellite

U54501 sequence D0Mco22 | 0.19 | 2289SI



(Y11739) transcription factor
[Homo sapiens]



D-MeAsp

receptor: ISOTYPE=epsilon3
[Mus musculus]



0.35 



0.32 



Human

papillomavirus type

605 | X74468 | 15 genomic DNA | 0.19 | 3595390



(AF096371) contains similarity
to Rattus norvegicus cyclin G-
associated kinase (SW:P97874)
[Arabidopsis thaliana] | 0.28



2582659 



(AJ002527) glucitol-6-
phosphate dehydrogenase
[Clostridium beijerinckii]



0.27 



252236S 



(AF008596) alpha 1,3-
fucosyltransferase [Helicobacter
pylori] | 0.16






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


608 


AF041141


Homo sapiens
pituitary specific
homeodomain protein
(PROP1) gene, exon
3 and complete cds


0.19


37403 


1 X0354 I ^ trie o^n*» nrrv4ii/<r /„ „ 

641) [Homo sapiens] 


t 

0.091 1 


609


L12531


Discopyge ommata
Ca2+ channel alpha
subunit gene
sequence.


0.19


3618274


(AJ223219) hypothetical protein


0.069


610


AF052445


Yellow fever virus
clone HONG9
polyprotein gene,
complete cds


0.19


1932822


(U15928) KH-domain putative
RNA binding protein


0.001


611 


Z36946 


B.anthracis sap gene 
encoding S-layer 
protein 


0.19


173241


(L06487) ZIP1 protein
[Saccharomyces cerevisiae]


2e-04


612


AF087984


Homo sapiens full
length insert cDNA
clone YW29A12


0.19


3786014


(AC005499) hypothetical

protein [Arabidopsis thaliana]


1e-06


613


AE001010


Archaeoglobus
fulgidus section 97 of
172 of the complete
genome


0.19


3135493


(AF06C4S) unknown
[Arabidopsis thaliana]


7e-08


614


L08965


Trichosporon
cutaneum carbamoyl
phosphate synthetase
large subunit (argA)
gene, partial cds.


0.19


1086901


(U41278) F33G12.3 gene
product [Caenorhabditis
elegans]


2e-08


615


M91466


Rattus norvegicus
A2b-adenosine
receptor mRNA,
complete cds.


0.19


2984320


(AE000773) acetoin utilization
protein [Aquifex aeolicus]


6e-09


616 | X95971


S.lividans groEL2
gene


0.19


3925277


Ai_uj^b4jj similar to
Uncharacterized protein family
UPF0034, Double-stranded
RNA binding motif; cDNA EST
yk489b3.5 comes from this
gene; cDNA EST yk439g7.5
comes from this gene
[Caenorhabditis elegans]


7e-10


617 | U12539


Schizosaccharomyces
pombe scd2 (scd2)
gene, complete cds.


0.19


1938549


(U97016) similar to drosophila
RIcl gene product ribosomal
protein L4 (YML4)
(PID:e459259)


3e-14
Nearest Neighbor (BlastN vs. Genbank)



SEQ 

ID | ACCESSION



DESCRIPTION 



618 



U12539 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Schizosaccharomyces 
pombe scd2 (scd2) 
gene, complete cds. 



ACCESSION 



DESCRIPTION 



(U97016) similar to drosophila



P VALUE



0.19 



1938549 



RIcl gene product ribosomal
protein L4 (YML4)
(PID:e459259)



619 



Z68327 



Human DNA 
sequence from 
cosmid U25D11, 
between markers 
DXS366 and DXS87 
on chromosome X. 



620 



Dictyostelium 
discoideum 
ORFvegl 14 mRNA, 
U66525 complete cds 



0.19 



3875774 



EMBL:D32434 comes from this
gene; cDNA EST

EMBL:D33710 comes from this
gene; cDNA EST
EMBL:D34467 comes from this

gene; cDNA EST
EMBL:D35005 comes from this
gene; cDNA EST
EMBL:D37535 comes from this
gene; ...

>gi|3878710|gnl|PID|e1348373
EST EMBL:D33710 comes
from this gene; cDNA EST
EMBL:D34467 comes from this
gene; cDNA EST
EMBL:D35005 comes from this
gene; cDNA EST
EMBL:D37535 comes from this
gene; ...



9e-15 



(AF056116) All-1 related
0.19 | J54028I protein [Fugu rubripes]



6e-15 



621 



U25830 



Newcastle disease 
virus isolate Hens/33 
matrix protein 
mRNA, complete cds 



622 | U89407



623 | AF095598



624 | AF064260



Mus musculus strain 
BALB/c delta- 
aminolevulinic acid 
dehydratase (Lv) 
mRNA, partial cds 



(U93868) RNA polymerase III
0.18 | 2228750 subunit [Homo sapiens]



Bison bison 
athabascae 
microsatellite BBJ 2 



Strongylocentrotus 
purpuratus SRC8 
mRNA, complete cds



0.19 



1825764 



0.18 



<NONE> 



0.18 



<NONE> 



(U88314) C46H11.11 gene 
product [Caenorhabditis 
elegans] 



1e-18



<NONE> 



<NONE> 



3e-25 



<NONE> 



<NONE> 



SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION | P VALUE



625 | U6951T 



626 I D89041 



Arabidopsis thaliana
AtKAP alpha mRNA,
complete cds



627 | M24571



628 



X59772 



629 | AL010209



630 | U67575



631 | U28730 



-lactis pepFI & 
632 I X99798 p T F9 , 



Bovine DNA for
prostaglandin 
F2alpha receptor. 

partial cds 

Dictyostelium 
discoideum tRNA-
Glu-GAA gene, clone
yGluGAA7.
D.melanogaster ovo 
gene required for 
female germ line 
development 
Plasmodium 



falciparum DNA *** 
SEQUENCING IN 
PROGRESS *** 
from contig 3-104, 
complete sequence 
Methanococcus 
jannaschii section 117 
of 150 of the 
complete genome 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



0.18



<NONE>



<NONE> 



0.18 



<NONE> 



0.18 



<NONE> 



<NONE> 



<NONE> 



P VALUE



<NONE>



<NONE>



0.18



<NONE>



<NONE>



<NONE>



<NONE> 



Caenorhabditis 
elegans cosmid 
K10B2 



0.18



<NONE>



<NONE>



<NONE>



0.18 



111839 



inositol 1,4,5-triphosphate 
receptor 2 - rat



8.5 



0.18 



633 | AF025306



634 | AF059251



635 



636 



Z22605 



AB011086



Danio rerio band 4.1
like protein 4 (nbl4)
mRNA, complete cds



Mus musculus
lipoxygenase (alox)
mRNA, complete cds



G.domesticus CTCF
protein mRNA.



Homo sapiens mRNA
for KIAA0514
protein, complete cds



(AE000232) orf, hypothetical
1787604 protein [Escherichia coli]



0.18 



3406624 



(AF079110) glycosomal malate
dehydrogenase [Trypanosoma

brucei]

PROBABLE NUCLEAR



8.3



0.18



465445



ANTIGEN herpesvirus 1 (strain
Kaplan) >gi|334072 (M34651)
ORF-3 protein [Pseudorabies
virus]



7.9 



0.18 



(Z81368) hypothetical protein 
1655667 Rv2393



0.18 



481864 



0.18 



3874158



3-methyl-2-oxobutanoate 
dehydrogenase 



6.6 



6.6 



(Z81464) predicted using 
Genefinder 



6.4 



SEQ

ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank)



637 



278536 



638 | U67530 



639 | M637SI 



DESCRIPTION 
Caenorhabditis 



elegans cosmid 
C07A4, complete 
sequence 
[Caenorhabditis
elegans] 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE



ACCESSION



DESCRIPTION



P VALUE



0.18 



3702121 



Methanococcus 
jannaschii section 72 
of 150 of the 
complete genome 



Influenza 

A/Duck/England/ 1/62 
(H4N6) nucleoprotein
mRNA, complete cds. 



0.18 



3877946 



(AJ011681) retinoblastoma-
related protein [Chenopodium

rubrum] I 5 4

pFTD54l Weak similarity to 65
KDA heat shock protein
(TR:G602231); cDNA EST
EMBL:D71705 comes from this
gene; cDNA EST
EMBL:D74382 comes from this
gene [Caenorhabditis elegans] | 6.3



0.18



640 I M7378I 



641 I X67219 



642 



643 



AF 106941 



Oryctolagus 

cuniculus integrin 

beta-8 subunit 

mRNA, complete cds 
:: gb|I44828|I44828 
Sequence 3 from 
patent US 5635601 



D.melanogaster Rop 
gene



AF052602 



Homo sapiens beta- 
arrestin 2 mRNA,
complete cds 



3873663 



TZWW4) cDNA EST
EMBL:D71510 comes from this
gene; cDNA EST
EMBL:C08449 comes from this
gene; cDNA EST yk266b12.3
comes from this gene; cDNA
EST yk266b12.5 comes from
this gene; cDNA EST
yk461h7.3 comes from this
gene; cDNA...



0.18 



major allergen OLE 1 7 -
1362129 common olive



0.18



(AB011527) MEGF1 [Rattus
3449286 norvegicus]



6.j 



5.8 



Danio rerio 
huntingtin (HD) 
mRNA, complete cds 



0.18 



548353 



0.18 



241058 



[PROTEIN-PII]
URIDYLYLTRANSFERASE
vinelandii >gi|39257 (X59610) 
uridylyl transferase 



4.8 



potential IGF binding protein 
[chickens, Peptide Partial, 77 aa,
segment 2 of 3]



3.7 



3.6 



SEQ
ID 



644 



Nearest Neighbor (BlastN vs. Genbank)





ACCESSION 



DESCRIPTION 



AB020709 



645 | AF096883



P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION



DESCRIPTION 



Homo sapiens mRNA
for KIAA0902 
protein, complete cds 



646 [ L3992S 



647 I Ml 70S? 



HIV-1 isolate patient
3 country USA pol 
polyprotein (pol) 
gene, partial cds 



0.18 



3875570 



Pyrocoelia miyako
(clone pB-PmL41) 
luciferase mRNA, 
complete cds 



0.18 



3250696 



0.18 



2914702 



Human 

carcinoembryonic 
nonspecific 
crossreacting antigen 
(CEA; NCA) gene, 
exons 1 and 2. 



U.oaj l4) predicted using
Genefinder; cDNA EST
EMBL:M75775 comes from this
gene; cDNA EST
EMBL:M89255 comes from this
gene; cDNA EST

EMBL:M89127 comes from this

gene; cDNA EST

EMBL:T00141 comes from this

gene; cDNA EST EMBL:T...



P VALUE



(AL024486) putative protein I 1.7 



(AC003974) unknown protein 
[Arabidopsis thaliana]



0.73



0.18



648 I X753IS 



_649 I AFQ1190S 



1351833 



H.sapiens ITIH1 gene 
(exon 22) and ITIH3 
gene 



REGULATORY PROTEIN 
ABAA 



0.18



629557 



Mus musculus 
apoptosis associated 
tyrosine kinase 
(AATYK) mRNA,
complete cds 



0.18 



650 I UQ4004 



651 I U8SHS 



330442 



RNA-binding protein rnpD
Arabidopsis thaliana (fragment)
>gi|510240 (X61108) RNA
binding protein [Arabidopsis
thaliana]



Simian 

immunodeficiency 
virus SIVagmVER-2 
envelope protein 
gene, partial cds. 



Xenopus laevis 
RanGTPase 
activating protein 



0.18



135102 



0.18 



995714 



1% 



(K03332) nuclear antigen 2 
[Epstein-Barr virus]



ASPARTYL-TRNA

SYNTHETASE aspartate-
tRNA ligase (EC 6.1.1.12) -
Escherichia coli coli]
>gi|1736513|gnl|PID|d10J640I
(D90829) Aspartate-tRNA
ligase (EC 6.1.1.12)
[Escherichia coli]



(X9125S) pid:e198503
[Saccharomyces cerevisiae]



2e-11






SEQ 
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



652 



DESCRIPTION P VALUE 



Z18921 



653 I M60650 



654 I U80912 



655 I AF012899 



656 I AF027174 



B.oleracea gene for S
receptor kinase-like
protein



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



(/oo3Xi; similar to ribokinase



P VALUE



S.cerevisiae STA2
gene, complete cds. 



Eucalyptus globulus 
NADP-isocitrate 
dehydrogenase 
(EglCDH) mRNA, 
complete cds 



0.18 



3875535 



cDNA EST EMBL:U6y5J3
comes from this gene; cDNA
EST EMBL:D65938 comes
from this gene; cDNA EST
yk280h9.3 comes from this
gene; cDNA EST yk280h9.5
comes from this gene; cDNA
EST yk223d11.3 come...



0.16 



<NONE> 



<NONE> 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath
B) mRNA, complete
cds 



0.16 



(AF057298) ornithine
decarboxylase antizyme 2 [Mus
3766172 musculus]



0.16 



76749 



hypothetical protein 4 - fowl 
adenovirus 1



657 | AF030231 



658 I M19183 



Glycine max sucrose
synthase (SS) mRNA,
complete cds 



659 



U31557 



Woodchuck hepatitis 

virus (WHV), 
complete genome, 
clone WHV 59. 
Ovine adenovirus 
IVa2 protein gene, 
DNA polymerase 
gene, terminal protein 
gene and 52.55 kDa 
protein gene, partial

cds



1e-19



<NONE> 



4.2 



0.16 



3044086 



(AF055904) unknown 
[Myxococcus xanthus]



0.078 



0.072 



0.072 



<NONE> 



1076190



<NONE>
cell wall glycoprotein, 75K,
precursor - diatom
(Cylindrotheca fusiformis)
>gi|515363 (X80394) P75K
gene product [Cylindrotheca
fusiformis]



3511143 



(AF061244) unknown 
[Agrocybe aegerita]



0.60 



<NONE> 



6.3 



6.2 






Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


660 


elegans cosmid
Y44A6B, complete
sequence
[Caenorhabditis
AL021491 elegans]


0.070


<NONE>


<NONE>


<NONE>


661 


X.laevis Xotch
protein mRNA,
M33874 complete cds.


0.070


1654096


(Y09076) RAD3
[Schizosaccharomyces pombe]


0.23


662 


Mus musculus
ZAN75 mRNA for
zinc finger protein,
AB012725 complete cds


0.069


1350800


MITOCHONDRIAL
RIBOSOMAL PROTEIN S5


2.0


663 


AL021491 


Caenorhabditis
elegans cosmid

Y44A6B, complete

sequence
[Caenorhabditis

elegans]


0.068


<NONE>


<NONE>


<NONE>


664 


Z60318 


H. sapiens CpG DNA,
clone lei, reverse
read cpelel.rla


0.068


1280134


(U55376) F16H11.2 gene
product [Caenorhabditis
elegans]


2.6


665 


Z35973 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR104w 


0.068 


2493000


PROBABLE SUCCINYL-

COA:3-KETOACID-
COENZYME A
TRANSFERASE PRECURSOR
EMBL:Z14816 comes from this
gene; cDNA EST
EMBL:Z14946 comes from this
gene; cDNA EST
EMBL:D69746 comes from this
gene; cDNA EST yk219b6.3
comes from this gene; cDNA
ES...


0.6S


666


Z86111


Streptomyces lividans
rpsP, trmD, rplS,
sipW, sipX, sipY,
sipZ, mutT genes and
4 open reading
frames


0.068


1235974


(X96713) collagen [Globodera
pallida]


4e-04


667


M729S0


Anthonomus grandis
vitellogenin gene
(VTG), complete cds.


0.068


3242750


(AC005164) match to ESTs
AA73U49 (NID:g2140138),
AA73190S (NID:g2752719),
AA287837 (NID:g1933519),
AA262811 (NID:g1S9S3S2),
and AA825S20 (NID:g2S99132)]


1e-59






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION



P VALUE 



668 I M34161 



Rat tachykinin (PPT) 
gene, exons 5 and 6 



669 | LQ381I 



Aspergillus niger zinc
finger protein (creA)
gene, complete cds



670 | M64983



671 | AF014051



Human fibrinogen
beta chain gene,
complete mRNA. >
gb|I47706|I47706
Sequence 3 from
patent US 5639940
Nicotiana tabacum
Mg chelatase subunit
(ChlH) mRNA,
partial cds



672 J Y07540 H.sapiens sil gene 



673 I AJ000347 



Rattus norvegicus
mRNA for 3'(2'),5'-
bisphosphate
nucleotidase



675 | X08050



Squid sodium channel
mRNA, complete cds
Yeast tRNA-Glu(3)
gene and flanking
regions



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



0.067



<NONE>



0.067 



0.067 



0.067 



0.067 



676 | X17115 



Human mRNA for
IgM heavy chain
complete sequence



677 | AF032871



Homo sapiens
uncoupling protein 3
(UCP3) gene, exon 1

and partial exon 2



0.067 



0.067 



0.067 



0.067 



<NONE> 



<NONE> 



P VALUE



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



glycoprotein GP330, renal
0.067 | 92331 (fragments)



rat 



129238 



25 KD OOKINETE SURFACE
ANTIGEN PRECURSOR
(PRS25) >gi|320962|pir||A44966
25k ookinete surface antigen
precursor - Plasmodium
reichenowi reichenowi]



2128473 
1334398 



1731331 



hypothetical protein MJ0750 -
Methanococcus jannaschii
>gi|1592304 (U67521)
ferredoxin-type protein

(X15081) MURF2 protein (AA
1-348)

HYPOTHETICAL 51.6 KD
PROTEIN CY49.14C
>gi|1370241|gnl|PID|e247089
(Z73966) hypothetical protein
Rv2075c [Mycobacterium
tuberculosis]



<NONE> 



<NONE> 



7.5 



7.4 



112900



ALPHA-2C-2 ADRENERGIC
RECEPTOR human >gi|178194
(J03853) kidney alpha-2-
adrenergic receptor [Homo
sapiens] >gi|1628638 (U72648)
alpha2-C4-adrenergic receptor
[Homo sapiens]



1.5 



0.65 



0.51 



0.50 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 












DYNAMIN 3 (DYNAMIN,




678 


VAC-l 1 O 


Mouse class II MHC 
E-beta 2 (d) gene 
exon 3 


0.067


585074


TESTICULAR) rat
>gi|391872|gnl|PID|d1003668
(D14076) testicular dynamin
[Rattus norvegicus]


3e-04 


679 


AlJUUOJOZ 


Candida albicans 
CaSLNl gene, 
complete cds 


0.067 


t 

3417296 


(AC003007) Unknown gene
product (partial) [Homo sapiens] 


9e-56 


680 


a cty> ! n a 
AJrUZIZjo 


African horse 
sickness virus capsid 
VP3 (L3) mRNA, 
complete cds 


0.066 


<NONE> 


<NONE> 


<NONE> 


681 


AE001507 


Helicobacter pylori, 
strain J99 section 68 
of 132 of the
complete genome 


0.066 


<NONE> 


<NONE> 


<NONE> 


682 


AF039717 


Caenorhabditis 
elegans cosmid 
R13H8 


0.066 


<NONE> 


<NONE> 


<NONE> 


683 


AF029027 


Syncerus caffer 
isolate Queen 
Elizabeth Mweya 14 
mitochondrial DNA 
control region 


0.066 


<NONE> 


<NONE> 


<NONE> 


684 


ArUo/yo / 


Homo sapiens full 
length insert cDNA 
clone i U3 


0.066 


2982476 


(X97203) CI protein [Beet curly 
top virus] 


9.5 


685 


J02037 


Baboon endogenous 
virus proviral long 
terminal repeat DNA. 


0.066 


972767 


(L37868) POU-domain 
transcription factor [Homo 
sapiens] 


7.3 


686 


AF000141 


Lycopersicon 
esculentum class I 
knotted-like
homeodomain protein
(LeT6) mRNA, 
complete cds 


0.066 


3157926 


(AC002L31) Strong similarity to
extensin-like protein gb|Z34465
from Zea mays. [Arabidopsis
thaliana]


5.6 


687 


AB001746 


Bensingtonia sp.
DK255 gene for 18S
rRNA > ::

dbj|AB001747|AB00
1747 Bensingtonia
sp. OK259 gene for
18S rRNA


0.066


3859889


(AF070064) cap 'n collar
isoform C [Drosophila
melanogaster]


0.38 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Helicobacter pylori,










688 


AE001461 


strain J99 section 22 
of 132 of the 
complete genome 


0.065 


<NONE> 


<NONE> 


<NONE> 


689 


M30821 


Chicken erythroid 
transport proteins c1
and c2 


0.065 


<NONE> 


<NONE> 


<NONE> 


690 


AB009802 


Homo sapiens gene 
for osteonidogen, 
intron 3 


0.065 


<NONE> 


<NONE> 


<NONE> 


691 


AF086062 


Homo sapiens full 
length insert cDNA 
clone YZ06B11


0.065 


<NONE> 


<NONE> 


<NONE> 


692 


AB002369


Human mRNA for 
KIAA0371 gene, 
complete cds 


0.065 


2500884 


SIGNAL SEQUENCE 
BINDING PROTEIN binding 
protein [Synechococcus sp.] 


5.5 


693 


AF086864 


Cyclopodia sp. large 
subunit ribosomal 
RNA gene,
mitochondrial gene
for mitochondrial
RNAs, partial
sequence > ::
gb|AF086866|AF086
866 Penicillidia sp.
large subunit
ribosomal RNA gene,
mitochondrial gene
for mitochondrial
RNAs, partial
sequence 


0.065 


3721684 


(AB012957) probable glycosyl 
transferase [Vibrio cholerae] 


5.5 


694 


L44593


Bacteriophage BK5-T
ORF410, 3' end of
cds, 20 ORFs,
repressor protein, and
Cro repressor protein
genes, complete cds,
ORF70' gene, 5' end
of cds.


0.065 


1172067 


PEPTIDASE T
(AMINOTRIPEPTIDASE)
influenzae Rd]


3.2 


695 


U80079


Ciona intestinalis
MyoD-family protein
(CiMDFa) mRNA,
complete cds


0.065


4218110


(AL035353) contains EST
!b.F152Sl


2.5 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION


DESCRIPTION


P VALUE


696


AB020718
AF082137


Homo sapiens mRNA
for KIAA0911
protein, complete cds


0.065 | 1722734


MINOR CAPSID PROTEIN L
>gi|1020192 type 231

(U89278) polyhomeotic 2
homolog [Homo sapiens]


21 1
1.9


697 I 


Zea mays copia-like 
retrotransposon Stl- 
14 leader region, 
partial sequence


1 0.065 I 1877501 


I 11 1 


698 1 X64053 


R.norvegicus ZnBP 
gene for zinc binding 
protein 


J 0.065 j 464963 


TRYPSIN PRECURSOR 


1 0.36 I 


,699 j U67065 


Mus musculus
butyrophilin (BTN)
gene, promoter region
and complete cds

1 0.065 3 ?n975i 


hypothetical protein YPL263c - 
_ yeast 


3e-10 1 
1 4e-19 1 


700 1 M64862 


Rat matrin F/G 
mRNA, complete cds


0.065 | 3420183


(AF041105) organic anion
transporter protein 3 [Rattus 
norvegicus] 


701 | K02205


Yeast (S.cerevisiae)
transcriptional
activator of amino
acid-biosynthetic
genes (GCN4) gene,
complete cds.


0.064 1 <NONE> 


<NONE> 


<NONE> J 


702 j X58282 


Maize mRNA for a
high mobility group 
protein 


0.064 1 <NONE> 


<NONE> 


<NONE> 1 


703 | AC001545


Homo sapiens
(subclone 1_f3 from
P1 H69) DNA
sequence


0.064 | <NONE>


<NONE> 1 




704 | AF02346I


Homo sapiens
TtASB region
sequence


. 0.064 J <NONE> 


<NONE> | 


<NONE> J 
<NONE>| 


705 | U50307


Caenorhabditis
elegans cosmid
F43H9.


0.064 | <NONE>


<NONE> 




706 | U46542


Streptococcus crista
h[mpA gene, partial
cds, putative
adhesin/ABC
transport system
protein (scbA) gene,
complete cds


0.064 | 1209391


(I583659) TPR protein pombe]
>gi|2894282|gnl|PID|e1251103
(AL021838) pre-mrna splicing
factor. [Schizosaccharomyces
pombe]


9.2 




707 | X57564


A.rusticana mRNA
for neutral peroxidase


0.064 | 1492037


(U60315) MC094R [Molluscum
contagiosum virus subtype 1]


6.9 









SEQ
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION | DESCRIPTION



708 I UQ698fi 



709 | D85773 



Human alpha-2- 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



macroglobulin 
receptor/lipoprotein 
receptor protein 
(A2MR/LRP) gene, 
exons 39-41. 



.710 | L06178 



Human CpG island 
sequence, clone 
Q28B8 



0.064 



100800 



P VALUE



rab15B protein - wheat
>gi|21853 (X62476) rab protein
[Triticum aestivum]



0.064 



2245382 



Apis mellifera 
ligustica complete 
mitochondrial 
genome



0.064 



3695379 



711 I Y1624?. 



Triticum aestivum 
mRNA for beta-
amylase



712 



L81779 



Homo sapiens
(subclone 2_a2 from
P1 H25) DNA



0.064 



1175958 



(U88325) suppressor of 
cytokine signalling-1 [Mus
musculus] 

(Aruyoj/U; contains similarity 
to a C. elegans hypothetical 
protein F44G4.1 (GB:249910) 
and several yeast hypothetical 
proteins such as 35.1 KD
protein in NAM8-GAR1
intergenic region (SP:P38805)
[Arabidopsis thaliana]
HYPOTHETICAL /U.j KD
PROTEIN IN AGP3-DAK3 
INTERGENIC REGION 
>gi|1084712|pir||S56201 
probable membrane protein 
YFL054c - yeast 
(Saccharomyces cerevisiae) 
>gi|836701|gnl|PID|dl009825 
(D50617) YFL054C 



5.3 



5.3 



3.2 



sequence

C.reinhardtii psb1



0.064 



3845169 



713 | X13826



mRNA for OEE1
protein of
photosystem II
(oxygen-evolving
enhancer protein)



714 | X06487 



H.sapiens mRNA for



bcl2-Ig fusion gene



715 



U79638 



Mus musculus cyclin-
dependent kinase

inhibitor protein
(p15(INK4b)) gene,

exon 2 and partial cds



0.064 



171040 



0.064 



2429362 



(AE001391) phosphatase (acid
phosphatase family) 



3.1 



(M94535) ATPase 
[Saccharomyces cerevisiae] 
cerevisiae, Peptide, 377 aa] 
[Saccharomyces cerevisiae]



0.81



(AF020261) proline rich protein
[Santalum album]



0.064 



392922 1 



(AF082557) TRF1-interacting
ankyrin-related ADP-ribose



polymerase [Homo sapiens]



0.054 



0.016 



1e-10






Nearest Neighbor (BlastN vs. Genbank)

SEQ

ID | ACCESSION



DESCRIPTION 



716 I U39099 



717 I U39673 



718 | AL022317 



Human T cell 



receptor alpha chain 
mRNA, partial cds



Clostridium
acetobutylicum KdpC
(kdpC) gene, partial
cds, sensor histidine
kinase homolog
(kdpD) and response
regulator homolog
(kdpE) genes,
complete cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION



0.063



<NONE>



P VALUE



<NONE> 



Human DNA
sequence from clone
I40L1 on
chromosome 22q13.1
13.31, complete 
sequence [Homo 
sapiens] 



0.063 



<NONE> 



<NONE> 



0.063 



1931640 



719 | U28972 



Spiroplasma citri orfa
and orff genes, partial
cds, orfb, orfc, and
orfe genes and
Spiroplasma virus
SpV1-derived ORF1
and ORF3 genes,
complete cds, and
SpV1-derived ORF14
gene, partial cds.



(U95973) Serine 
carboxypeptidase isolog
[Arabidopsis thaliana]



<NONE>



<NONE>



5.2 



720 | U15159



721 | AF058416



722 | AE001430



Mus musculus limk
kinase (limk) mRNA,
complete cds



0.063 



(AF070704) envelope 
glycoprotein [Human 
4091939 immunodeficiency virus type 1]



Homo sapiens 
lipoprotein receptor- 
related protein 
(LRP1), exons 39, 40,

and 41 

Plasmodium 



0.063 



(AC004877) sco-spondin-mucin-
like; similar to P98167 uncertain
3638957 [Homo sapiens]



5.2 



5.1 



0.063 



(AE000276) orf, hypothetical 
1788123 protein [Escherichia coli]



falciparum 
chromosome 2, 
section 67 of 73 of 
the complete 
sequence 



0.063 



2244849 | (Z97337) hypothetical protein



4.0 



4.0 



ACCESSION | DESCRIPTION
Streptococcus 



723 



L29323 



725 



X72631



U17969 



726 I AEOoinnn 



P VALUE I ACCESSION 



pneumoniae methyl 
transferase gene 
cluster, complete 
sequence 



0.063 



3874022 



H.sapiens mRNA 
encoding Rev- 
ErbAalpha > ::
emb|X72632|HSREV
ERB2 H.sapiens
mRNA encoding Rev-
ErbAalpha (internal
fragment) 

Human initiation 
factor eIF-5A gene, 
complete cds. 



DESCRIPTION 



0203) cDNA EST" 



0.063 



3979878 



EMBL:D72339 comes from this
gene; cDNA EST
EMBL:D75197 comes from this

gene [Caenorhabditis elegans]
A/jiuo; predicted using

Genefinder; cDNA EST
EMBL:T01277 comes from this
gene; cDNA EST
EMBL:T01796 comes from this
gene; cDNA EST
EMBL:D32545 comes from this
gene; cDNA EST
EMBL:D33060 comes from this
gene; cDNA EST EMBL:D...
(AF025467) contains similarity
to drosophila DNA-binding
protein K10 (NID:g8148)
[Caenorhabditis elegans]



P VALUE



0.063 



727 



S80986 



728 | AF1Q9134 



729 I D87466 



730 | AB018269



731 



D86954 



nuclear 

receptor/retinoid 
signaling modulator 
[zebrafishes, mRNA, 
3876 nt] 



Homo sapiens 7-60 
mRNA, complete cds



Human mRNA for 
KIAA0276 gene, 
partial cds 



0.063 



1326288 



1083764 



0.063 



2879865 



Homo sapiens mRNA 
for KIAA0726
protein, complete cds



Cricetulus griseus
mRNA for
Cytochrome P-450
14, complete cds



0.063 



2995865 



0.063 



2496S96 



(AF082486) nef protein [Human 
immunodeficiency virus type 1]



(U58734) weak similarity to
ankyrin G [Caenorhabditis
elegans]



proline-rich proteoglycan 2
precursor, parotid - rat
>gi|310200 (L17318) proline-
rich proteoglycan [Rattus
norvegicus]



(AL021816) SPBC24E9.03C,
unknown, len:251aa
[Schizosaccharomyces pombe]



(AF053455) tetraspan TM4SF 



[Homo sapiens]
HYPOTHETICAL 47.6 KD

PROTEIN C16C10.5 IN

CHROMOSOME III

>gi|3S743S3|gnl|PID|e1344077

type (RING finger)
[Caenorhabditis elegans]



2.3 



1.7 



1.4 



0.35 



0.093 



6e-05 



2e-16 



Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION 







SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION P VALUE 



741 I M97695 



742 | U67526 



743 | Z78414 



744 | Y13606



Leishmania pifanoi 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



cysteine proteinase 
(cys2) gene, complete 
cds. 



0.062 



1174754 



Methanococcus
jannaschii section 68
of 150 of the
complete genome



Caenorhabditis
elegans cosmid
W09D12, complete
sequence
[Caenorhabditis
elegans]



0.062 



1330345 



TROPOMYOSIN I (TMI) 



(POLYPEPTIDE 49)
>gi|320989|pir||A60607
tropomyosin - fluke



coded for by C.

elegans cDNA yk34b1.5; coded
for by C. elegans cDNA
yk13h10.5; coded for by C.
elegans cDNA yk46e8.5; coded
for by C. elegans cDNA
yk46d5.5; coded for by C.
elegans cDNA yk43c2.5; coded
for by C. elegans cDNA
yk46e8....



P VALUE 



0.018 



0.061 



Mus musculus gene 
encoding filensin, 
exons 6, 7



<NONE> 



0.061 



2314715 



745 | J04374 



746 | AB022200 



747 | X54250 



748 | X69942 



749 | AJ223206 



750 | Y10205 



Eggplant mosaic 
virus genome. 



Marine obligately 
oligotrophic 
bacterium POO- 10 
DNA for 16S 
ribosomal RNA, 
partial sequence 



0.061 



141449 



0.061 



Rat mRNA for zinc 
finger protein AT-
BP2, partial cds 



3983593 



0.061 



M. musculus mRNA 
of enhancer- trap- 
locus I 



1377886 



0.061 



Mus musculus mRNA 
for scrapie responsive 
protein 1 



2983969 



0.061 



H. sapiens mRNA for 
CDS 8 protein 



4204265 



0.060 



<NONE> 



<NONE> 



(AE000651) H. pylori predicted 
coding region HP 1527 



HYPOTHETICAL 35.5 KD 
PROTEIN IN TRANSPOSON 
TN4556 >gi|80759|pir||JQ0431 
hypothetical 35.5K protein - 
Streptomyces fradiae transposon 
Tn4556 



1e-40 



<NONE> 



4.9 



(AB000307) transcarboxylase- 
beta 



(L46815) DNA binding protein 
Rc [Mus musculus] 



(AE00074S) putative protein 
[Aquifex aeolicus] 



(AC005223) 45643 
[Arabidopsis thaliana] 



<NONE> 



3.S 



0.9S 



0.57 



5e-31 



<NONE> 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



751 | U79260 



752 | X074S3 



753 I U5750? 



Human clone 23745 
mRNA, complete cds 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



DESCRIPTION 



P VALUE 



Plasmodium 
falciparum 11-1 gene 
part 1 



0.060 



<NONE> 



<NONE> 



Rattus norvegicus 
protein tyrosine 
phosphatase delta 
gene, catalytic 
domain, partial cds. 



754 | X68359 



M.fascicularis gene 
for apolipoprotein C- 
III 



755 I X51634 



>"seudomonas braB 
gene for branched 
chain amino acid 
transport carrier (LIV-
II) 



0.060 



<NONE> 



<NONE> 



0.060 



3452285 



(AF044915) polar tube protein 
PTP55 precursor 



o.oea 



730843 



SHUTTLE CRAFT PROTEIN 
>gi|487400 



Gossypium hirsutum 
cotton fiber expressed 
protein 2 (CFE2) 
756 1 AF0724ns mRNA, complete cds nnso 



0.059 



1835622 



(U85718) CCML [Pseudomonas 
putida GB-1] 



<NONE> 



<NONE> 



0.28 



2e-04 



,423766 



757 | AF012899 



758 I AF093268 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds | 0.056 



alkaline phosphatase, 145K - 
Synechococcus sp. 



Rattus norvegicus 
homer-1c mRNA, 
complete cds | 0.054 



2662481 



(AF034859) juvenile hormone 
resistance protein 



8.1 



4.7 



3.3 



759 | X61046 



760 | AJ005813 



761 | S79S43 



Hydra N-COL 2 
mRNA for mini-
collagen, partial cds 



547847 | LECTIN PRECURSOR 



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 



0.053 



<NONE> 



<NONE> 



0.052 



<NONE> 



<NONE> 



random amplified 
hybridization 
microsatellite 
(RAHM) [Beta 
vulgaris=sugar beets. 
Genomic, 537 nt] 



1730145 



GAMETOGENESIS 
EXPRESSED PROTEIN GEG-
154 >gi|2137331|pir||I48361 
gene GEG-154 protein - mouse 
>gi|550123 (X71642) 
pid:g550123 [Mus musculus] 



7.0 



<NONE> 



<NONE> 



2e-16 






Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 

ACCESSION | DESCRIPTION | P VALUE 



762 I ABOOOfWfi 



763 I Z62366 



764 | L11670 



Mouse mRNA for 



GATA-2 protein, 
complete cds 



0.023 



H.sapiens CpG DNA, 
clone 67h7, forward 
read cpg67h7.ft1a. 
Human 



0.023 



transmembrane 
glycoprotein (CD53) 
gene, exons 2 through 



0.023 



765 



Sulculus diversicolor 
DNA for IDO-like 
myoglobin, complete 
D83984 cds 



S.tuberosum mRNA 
for inorganic 
phosphate 
766 | X98890 | transporter, StPT1 



0.023 



0.023 



767 



Dissostichus mawsoni 
preprotrypsin gene, 
U58835 | complete cds 



Glomus versiforme 
chitin synthase gene 

768 | AJ009630 | (clone Gvchs3) 



0.022 



Human glucagon 
J04040 | mRNA, complete cds. 



770 



L.esculentum Asr3 
X74908 | gene 



0.022 



0.022 



0.022 



771 



Shigella dysenteriae 
O-antigen 
polysaccharide 
biosynthesis rfbX, O-
antigen polymerase 
(rfc), rhamnosyl 
transferase I and II 
(rfbR and rfbQ) and 
rfbD genes, complete 
L07293 | cds. 



0.022 



<NONE> 



3123312 



80636 



3114665 



DESCRIPTION 



P VALUE 



<NONE> 



ZINC FINGER PROTEIN 142 - 
(KIAA0236) to Human zinc 
finger protein(ZNF142) [Homo 
sapiens] 



hypothetical 67K protein - 
Mycobacterium fortuitum 
plasmid pAL5000 >gi|149986 
(M60875) ORF2 



(AF061267) inner membrane 
component HtxE [Pseudomonas 
stutzeri] 



<NONE> 



5.9 



3.4 



683532 



(X02155) thyroglobulin [Bos 
taurus] 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 
<NONE> 



3.4 



1.1 



<NONE> 



<NONE> 



<NONE> 
<NONE> 



<NONE> 



<NONE> 



<NONE> 






SEQ 



Nearest Neighbor (BlastN vs. Genbank) 



ID | ACCESSION | DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



781 



M10217 



782 I M55147 



783 



X58839 



Xenopus laevis 



P VALUE | ACCESSION 



mitochondrial DNA. 
complete genome. 



0.022 



2145763 



DESCRIPTION 



P VALUE 



B2168_C2_205 protein 
Mycobacterium leprae 



Pea chloroplast 
glyceraldehyde-3- 
phosphate 
dehydrogenase 
(Gpbl) gene, 
complete cds. 



Acholeplasma virus 
MV-L1 DNA for 
complete circular 
genome 



0.022 



417308 



0.022 



3273189 



784 I M26185 



Mouse c-myb 
oncogene, exon 1 and 
exon 2 (partial). 



PROBABLE HELICASE 
MOT1 Motlp is a probable 
helicase essential for vegetative 
growth on rich glucose medium 
at 30 degree C: Swiss-Prot 
Accession number P32333; 
similar to S. cerevisiae RAD26 
gene product: Swiss-Prot 
Accession number P40352 



7.3 



(AB008/5/) subunit II of 
c(o/b)3-type cytochrome c 
oxidase [Bacillus 
stearothermophilus] 



785 | AF061195 



Streptomyces albus 
valine dehydrogenase 
(Vdh) gene, complete 
cds 



0.022 



0.022 



138592 



2088768 



Homo sapiens alpha 
1,2-mannosidase IB 



786 | AF053622 | gene, exon 9 



787 | Z71500 



S.cerevisiae 
chromosome XIV 
reading frame ORF 
YNL224c 



0.022 



1352361 



0.022 



788 | D10471 



Herpes simplex virus 
type 2 genomic DNA 
for 0.74-0.84 region, 
complete cds 



789 I U43082 



Zea mays T 
cytoplasm male 
sterility restorer 
factor 2 (rf2) mRNA, 
complete cds 



0.022 



1708875 



3132276 



0.022 



3319720 



VITELLOGENIN I 
PRECURSOR (YOLK 
PROTEIN 1) 

>gi|72270|pir||VJFF1 
vitellogenin I precursor 
unnamed protein product 
[Drosophila melanogaster] 



(AF003145) B0414.8 gene 
product [Caenorhabditis 
elegans] 



EARLY GROWTH 

RESPONSE PROTEIN 1 fish 
>gi|531456 (U12895) egr1 
Danio rerio] rerio] 



PUTATIVE TUMOR 
SUPPRESSOR LUCA15 
sapiens] 



(AB01 14S6) short ORF [TT 
virus] 



(AL031035) putative aldehyde 
dehydrogenase [Streptomyces 
coelicolor] 



4.2 



2.5 



0.86 



0.36 



0.16 



0.13 



0.011 






SEQ 



Nearest Neighbor (BlastN vs. Genbank) 



ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION 
dopamine D2 



receptor [human, 
brain, Genomic, 3794 
nt, segment 4 of 5] 



802 I M6Q57? 



803 I AF045654 



804 | M69023 



Rat nerve growth 
factor-inducible 
protein (VGF) gene, 
complete cds. 



Gallus gallus 
neuregulin beta-1a 
mRNA, complete cds 



Human globin gene 



805 I Z65960 



806 | X97073 



807 | X56491 



H.sapiens CpG DNA, 
clone 69d2, reverse 
read cpg69d2.rt1b. | 0.021 



A.oligospora gene 
encoding lectin 



D. melanogaster 
mRNA for gene 
containing opa 
repetitive element 



808 I L78760 



809 | AB007864 



Homo sapiens 
(subclone l_f6 from 
PI H3 1 ) DNA 
sequence 



810 | AL021932 



Homo sapiens 
KIAA0404 mRNA, 
partial cds 



Mycobacterium 
tuberculosis H37Rv 
complete genome; 
segment 22/162 




Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 
DESCRIPTION 



P VALUE 



j(YU8029)NAD(P)( + )-arginine' 
ADP- ribosyl transferase 
[Oryctolagus cuniculusl I 5 j 



f (AF030050) replication factor C 
[[Rattus norvegicusl I 3 j 



1A% 



(AF040647) No definition line 
found [Caenorhabditis elegans] 
(Z82056) T26H5.8 * 



[Caenorhabditis elegans] 
>gi|3880787|gnI|PID|e 1 350288 
(AL032620) T26H5.8 | 2.4 

(U80845) similar to family I of 
G-protein coupled receptors 
[Caenorhabditis elegans] | 0.79 

CORE ANTIGEN 

>gi|73601|pir||NKVLC2 core 
antigen - woodchuck hepatitis 
virus 2 >gi|336135 



0.47 



HOMEOBOX PROTEIN DLX- 
17 >gi|1620520 



0.16 



!!!! ALU CLASS F WARNING 
ENTRY !!!! 
CYSTEINE SYNTHASE A (O-
ACETYLSERINE 
SULFHYDRYLASE A) (O-
ACETYLSERINE (THIOL)-
LYASE A) (CSASE A) 
>gi|68323|pir||SYEBAC cysteine 
synthase (EC 4.2.99.8) A - 
Salmonella typhimurium 
>gi|153935 (M21450) cysK 
protein [Salmonella 

typhimurium] | 0 1 



(AL021932) hypothetical 
protein Rv0439c 



7e-10 








Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















811 


U89991 


Hypocrea jecorina 
mannose-1-phosphate 
guanylyltransferase 
(MPG1) mRNA, 
complete cds 


0.021 


3581924 


(AL031538) mannose-1-
phosphate guanyltransferase 
[Schizosaccharomyces pombe] 


6e-20 


812 


X00641 


Sugar beet 
mitochondrial 
minicircle pO 
sequence 


0.020 


<NONE> 


<NONE> 


<NONE> 


813 


Z50097 


D.melanogaster 
mRNA for hdc 
protein. 


0.020 


<NONE> 


<NONE> 


<NONE> 


814 


AF044866 


Phoebis sennae large 
subunit ribosomal 
RNA gene, partial 
sequence; tRNA-Val 
gene, complete 
sequence; and small 
subunit ribosomal 
RNA gene, partial 
sequence, 

mitochondrial genes 
for mitochondrial 
RNAs 


0.020 


<NONE> 


<NONE> 


<NONE> 


815 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


816 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
B) mRNA, complete 
cds 


0.020 


<NONE> 


<NONE> 


<NONE> 


817 


AE001405 


Plasmodium 
falciparum 
chromosome 2, 
section 42 of 73 of 
the complete 
sequence 


0.020 


2196776 


(AF003342) bunched gene 
product [Drosophila 
melanogaster] 


8.4 


818 


AF074387 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


0.020 


627071 


histidine-rich protein - 
Plasmodium lophurae 


2.S 






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


819 


Y13304 


Hylobates hoolock 
mitochondrial DNA 
for cytb gene, Horace 


0.020 


285580 


(D10043) ORF [Acetobacter 
pasteurianus] 


2.1 


820 


Z66539 


H. sapiens creatine 
transporter gene 


0.020 


1703594 


lUHlWjyj coded for by C. 

elegans cDNA yk7c8.5; coded 
for by C. elegans cDNA 
ykI33b3.5; coded for by C. 
elegans cDNA yk65a4.5; coded 
for by C. elegans cDNA 
yk7c8.3; coded for by C. 
elegans cDNA CEESQ66F; 
coded for by C. elegans cDNA 
yk65a4.3;... 


0.98 


821 


AF053622 


Homo sapiens alpha 
1,2-mannosidase IB 
gene, exon 9 


0.020 


1352361 


EARLY GROWTH 
RESPONSE PROTEIN 1 fish 
>gi|531456 (U 12895) egrl 
Danio rerio] rerio] 


0.72 


822 


M20555 


Human MHC class II 
HLA-DRw53-beta 
(DR4,w4) gene, 
exons 2,3,4,5,6. 


0.020 


465569 


HlHUlHtl IL.AL J». 1 KL> 

PROTEIN IN SBCB-HISL 
INTERGENIC REGION 
>gi|405956 (U00009) 
ORF_ID:o349#4; similar to 
[SwissProt Accession Number 
P33015] [Escherichia coli] 
>gi|1736693|gnl|PID|d 1016570 
Number P33015] [Escherichia 
coli] >gi|1788323 (AE000292) 
putative transport system 
permease protein [Escherichia 
coli] 


0.43 


823 


M20555 


Human MHC class II 
HLA-DRw53-beta 
(DR4,w4) gene, 
exons 2,3,4,5,6. 


0.020 


1709751 


COENZYME PQQ 
SYNTHESIS PROTEIN F 
synthesis F - Pseudomonas 
fluorescens >gi|929802 


0.42 






Nearest Neighbor (BlastN vs. Genbank) 



SEQ | Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 

ID | ACCESSION | DESCRIPTION | P VALUE | ACCESSION 



824 | AJ005015 



825 | AF034099 



Homo sapiens mRNA 
for putative SMC-like 
protein, partial 



0.020 



267449 



Laccaria bicolor 
glyoxal malate 
synthase protein 
mRNA. complete cds 



DESCRIPTION 



P VALUE 



chromosome hi 

>gi|102507|pir/|S15787 
hypothetical protein 1 (cosmid 
ZK637) - Caenorhabditis 
elegans Genefinder; cDNA EST | 
yk217b5.3 comes from this 
gene; cDNA EST yk2 1 7b5.5 
comes from this gene; cDNA 
EST yk340gI2.3 comes from 
this gene; cDNA EST 
yk340gl2.5 comes from this 
gene; cDNA EST yk428c5.5 

CO... 



1e-12 



826 | AF100694 



827 I AF093268 



Mus musculus 
Pontin52 mRNA 
complete cds 



0.020 



1109847 



Rattus norvegicus 
homer-1c mRNA, 
complete cds 



0.019 



132836 



(U41538) No definition line 
found [Caenorhabditis elegans] 



828 | AF100694 



Mus musculus 
Pontin52 mRNA, 
complete cds 



0.019 



2633401 



0.019 



2492604 



60S RIBOSOMAL PROTEIN 
L28 protein L28 [Rattus 
norvegicus] 



5.7 



(Z99109) similar to DNA 
exonuclease 



4.5 



MULTIDRUG RESISTANCE 
PROTEIN CDR2 albicans] 



829 



U67538 



830 | U56088 



Methanococcus 
jannaschii section 80 
of 150 of the 
complete genome 



Human periodic 
tryptophan protein 2 
(PWP2) gene, exons 
3 to 14 



831 



U76524 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



0.019 



1723566 



0.019 



2144804 



0.018 



1916976 



PUTATIVE 

GLUCOSYLTRANSFERASE 
C17C9.07 

>gi|1314159|gnl|PID|e241760 
(Z73099) SPAC17C9.07, 
putative glucosyl transferase len 
501, similar to 
SW:ALG8_YEAST P40351 
glucosyltransferase ALG8 
pombe] 



collagen alpha 1(11) chain 
bovine 



4.4 



2.7 



(U91682) vitelline membrane 
protein homolog [Aedes 
aegypti] 



0.040 



7.2 








Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















832 


AF026258 


Onobrychis viciifolia 
chalcone synthase 
(CHS) mRNA, 
complete cds 


0.018 


763076 


(Z48799) ZP3 [Cyprinus carpio 
>gi|777724 (L41637) egg 

carpio] 


1 

5.2 


833 


U95094 


Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 


0.009 


3955011 


(AJ005438) beta adrenoreceptor 
B 


0.60 


834 


X71603 


C.jejuni VS1 DNA > 

emb|A39603|A39603 
Sequence 2 from 
Patent WO9417205 > 
:: gb|I76090|I76090 
Sequence 2 from 
patent US 5691138 


0.008 


<NONE> 


<NONE> 


<NONE> 


835 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.008 


138116 


HEAD FIBER PROTEIN 
(LATE PROTEIN GP8.5) 

o \ - | o^vj|pii || ™ ivixir on gene 
8.5 protein - phage PZA 
>gi|216057 (Ml 1813) head 
fiber protein 


8.1 


836 


X91751 


Bovine herpesvirus 
type 1 UL7 gene 


0.008 


1711436 


SUPEROXIDE DISMUTASE 
(FE) 1.15.1.1) (Fe)- 

i acuUUlIIUIldb u. CI LI £1 1 1 1 Uo u 

>2i|409767 


5.9 


837 


M95594 


Arabidopsis thaliana 
1-aminocyclopropane- 
1-carboxylate 
synthase (ACS2) 
gene, complete cds. 


0.008 


683698 


(Z48229) orf 1 gene product 
[Saccharomyces cerevisiae] 


1e-06 


838 


U67465 


Methanococcus 
jannaschii section 7 
of 150 of the 
complete genome 


0.008 


3874664 


(Z68493) predicted using 
Genefinder 


1e-07 


839 


X72388 


B.taurus mRNA for 
filensin 


0.008 


100174 


1-aminocyclopropane-1-
carboxylate synthase 


7e-09 


840 


U22398 


Human Cdk-inhibitor 
p57KIP2 (KIP2) 
mRNA, complete cds. 


0.008 


2228750 


(U93868) RNA polymerase III 
subunit [Homo sapiens] 


2e-lS 


841 


L42546 


Xenopus laevis LIM 
class homeodomain 
protein 


0.007 


<NONE> 


<NONE> 


<NONE> 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Homo sapiens 










842 


AF041428 


ribosomal protein s4 
X isoform gene, 
complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


843 


AF000227 


Secale cereale omega 
secalin gene, 
complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


844 


D86254 


Human MHC (HLA) 
DRB intron 1 DNA. 
partial sequence 


0.007 


<NONE> 


<NONE> 


<NONE> 


845 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


846 


Y07738 


M.musculus gene for 
vimentin 


0.007 


<NONE> 


<NONE> 


<NONE> 


847 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.007 


<NONE> 


<NONE> 


<NONE> 


848 


AF055119 


Homo sapiens alpha- 
tectorin (TECTA) 
gene, exon 6 


0.007 


<NONE> 


<NONE> 


<NONE> 


849 


M61195 


Zucchini 1- 

ami nocyclopropane- 1 ■ 

carboxylate synthase 


0.007 


<NONE> 


<NONE> 


<NONE> 


850 


Y 11050 


Homo sapiens DSG3 
gene, partial intron 
and partial exon 6, 
140 bp 


0.007 


<NONE> 


<NONE> 


<NONE> 


851 


X61204 


M.voltae vhuD, 
vhuG, vhuA, vhuU & 
vhuB genes 


0.007 


<NONE> 


<NONE> 


<NONE> 


852 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


853 


S43S82 


telomere: 

{ minichromosome, 
repeats ) 

Trypanosoma brucei. 
Genomic, 1 !70 ntl 


0.007 


<NONE> 


<NONE> 


<NONE> 



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Nearest Neiahbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















854 


L32674 


Geomydoecus nadleri 
mitochondrial 
cytochrome oxidase I 
gene, partial cds. 


0.007 


<NONE> 






855 


U58732 


Caenorhabditis 
elegans cosmid 
F4SD6. 


0.007 


<NONE> 


<NONE> 


<NONE> 


856 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


0.007 


<NONE> 


<NONE> 


<NONE> 


857 


Z35284 


H.sapiens mRNA for 
MDR3 P- 
glycoprotein 


0.007 


1730696 


HYPOTHETICAL 121.1 KD 

INTERGENIC REGION 
PRECURSOR YNR067c - yeast 
(Saccharomyces cerevisiae) 


9.5 


858 


X15217 


Human sno oncogene 
mRNA for snoA 
protein, ski-related 


0.007 


902455 


(U24203) membrane protein 
[Escherichia coli] 


8.8 


859 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.007 


1684636 


(Y09454) ORF3 [Lactobacillus 
casei bacteriophage A2] 


8.3 


860 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


0.007 


3878803 


(Z48795) R05H5.7 
[Caenorhabditis elegans] 


8.3 


861 


S76317 


Iiy=ISU-2TJ0kda 
membrane protein 
scavenger receptor 
homolog {clone 18, 
intron and flanking 
exons 14 and 15} 
[sheep, lymph node, 
lymphocytes, 
Genomic, 30S nt, 
segment 2 of 2] 


0.007 


294747 


(L08174) ORF2 
[Romanomermis culicivorax] 


7.4 
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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



862 



D88084 



P VALUE 



Pedicularis 
verticillata 
chloroplast DNA, 
intergenic region 
between trnT(UGU) 
and trnL(UAA) 5' exon 



863 



Chicken mRNA for 
aldehyde 
X58869 | dehydrogenase 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



0.007 



0.007 



2555187 



115978 



DESCRIPTION 



P VALUE 



(AF026789) vitellogenin 
[Pimpla nipponica] 



CD30L RECEPTOR 
PRECURSOR 
(LYMPHOCYTE 
ACTIVATION ANTIGEN 



6.9 



864 



Homo sapiens mRNA 
for GS3786, complete 
D87120 | cds 



0.007 



3879589 



JUU/ JJ 1 rOITTTC Mill UUMIdlll, 

cDNA EST EMBL:D35637 
comes from this gene; cDNA 
EST yk322a3.5 comes from this 
gene; cDNA EST yk397b2.5 
comes from this gene; cDNA 
EST yk348bl 1.5 comes from 
this gene; cDNA EST 
yk397b2.3 comes fr... 

>gi|3880965|gnl|PID|el350578 
comes from this gene; cDNA 
EST yk322a3.5 comes from this 
gene; cDNA EST yk397b2.5 
comes from this gene; cDNA 
EST yk348bl 1.5 comes from 
this gene; cDNA EST 
yk397b2.3 comes 



865 | X68793 



H.sapiens gene for 
antithrombin III 



0.007 



2358285 



(AF010403) ALR [Homo 
sapiens] 



Danio rerio mRNA 
for opioid receptor 
866 | AJ001596 | homologue 



0.007 



2507509 



HVPOlHJbllCAL^y.X KI5 
PROTEIN IN HOLB-PTSG 
INTERGENIC REGION 
>gi| 1787342 (AE000210) orf, 
hypothetical protein 
[Escherichia coli] protein in 
holB 3' region [Escherichia 
coli] 



867 



Streptomyces albus 
valine dehydrogenase 
(Vdh) gene, complete 
AF061195 | cds 



868 



Arabidopsis thaliana 
mRNA for 

neoxanthin cleavage 
AJ005813 | enzyme 



0.007 



2088768 



(AF003145) B0414.8 gene 
product [Caenorhabditis 
elegans] 



0.007 



171Q1Q5 



UDP-N- 
ACETYLGLUCOS AMINE 2- 
EPIMERASE UDP-N- 
acetylglucosamine 2-epimerase 
[Plasmid pWQ799] 



Ho 5 



1.9 



1.7 






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


869 


L03398 


Zebrafish retinoic 
acid receptor alpha 
2.A 


0.007 


2239219 


(Z97210) hypothetical protein 


0.77 


870 


D63484 


Human mRNA for 
KIAA0150 gene, 
partial cds 


0.007 


19917 


(Z14014) Pistil extensin like 
protein, partial CDS only 


0.61 


871 


M31483 


Maize glyceraldehyde-
3-phosphate 
dehydrogenase, 3' 
end. 


0.007 


543068 


milfl n frn/*hA-r»Hrr\n/*h ii 1 iAr\ct 

umi.111, u unicuurunLiiidi - OOE 

>gi|402558 


0.45 


872 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP 17.4) mRNA, 
complete cds 


0.007 


2494941 


ALPHA-2B ADRENERGIC 
RECEPTOR adrenoceptor 

>gi|1587159|prf||2206293B 
adrenoceptor alpha2B [Cavia 
porcellus] 


0.42 


873 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA. 
complete cds 


0.007 


1110587 


/ y*tiu^ nuciear localization ° 
signals Peptide, 140 aa] [Mus 
sp.l 


0.26 


874 


X88931 


H.sapiens PAL2A 
gene 


0.007 


1706176 


CUTINASE TRANSCRIPTION 
FACTOR 1 ALPHA 
>gi|1262912 (U5167I) cutinase 
transcription factor I [Fusarium 
solani f. sp. pisi] 


0.2! 


875 


S74155 


zRAR alpha=retinoic 
acid receptor alpha 
[zebrafish, embryos, 
mRNA, 1773 nt] 


0.007 


2239219 


(Z97210) hypothetical protein 


0.11 


876 


M74193 


Petromyzon marinus 
plasma albumin 
mRNA, complete cds. 


0.007 


730888 


OCTAPEPTIDE-REPEAT 
PROTEIN T2 


0.011 


877 


U03673 


Saccharomyces 
cerevisiae Spp41p 
(SPP41) gene, 
complete cds. 


0.007 


38208S5 


(AL033126) 65G3.k 
[Drosophila melanogaster] 


0.001 


878 


D37766 


Homo sapiens mRNA 
for Laminin-5 beta3 
chain, complete cds 


0.007 


1235974 


(X96713) collagen [Globodera 
pallida] 


3e-06 



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Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION | DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



Caenorhabditis 



879 | AF022388 



elegans putative 
transcription factor 
MAB-3 (mab-3) 
gene, complete cds 
Acanthamoeba 



880 



castellanii 
transformation-
sensitive protein 
homolog mRNA, 
U89984 | complete cds 



(AF095741) unknown [Rattus 
0.007 | 3747107 | norvegicus] 



(U89984) transformation-
0.007 | 1890281 | sensitive protein homolog 



881 | AB020689 



Homo sapiens mRNA 
for KIAA0882 
protein, partial cds 



882 



Mus musculus 
Pontin52 mRNA, 
AF100694 | complete cds 



883 | AF027173 



884 



U76524 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA, complete 
cds 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



0.007 



3880809 



0.006 



<NONE> 



0.006 



<NONE> 



0.006 



<NONE> 



m U-U,l t b-)f JIIIUIUI LU I fuuuun. 

rabGAP domains; cDNA EST 

EMBL:D34945 comes from this 
gene; cDNA EST 

EMBL:D27313 comes from this 

gene; cDNA EST 

EMBL:D34829 comes from this 
gene; cDNA EST 
EMBL:D27312 comes from this 
gene; cDNA ... Probable 
rabGAP domains; cDNA EST 
EMBL:D34945 comes from this 
gene; cDNA EST 
EMBL:D27313 comes from this 
gene; cDNA EST 
EMBL:D34829 comes from this 
gene; cDNA EST 
EMBL:D27312 comes from this 
gene; cDNA ... 



5e-09 



2e-09 



<NONE> 



<NONE> 



<NONE> 



1e-23 



<NONE> 



<NONE> 



<NONE> 



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Nearest Neighbor (BlastN vs. Genhnnl^ 







SEQ 
ID | ACCESSION 



Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



DESCRIPTION 



P VALUE | ACCESSION 



DESCRIPTION 



897 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
B) mRNA, complete 
AF027174 | cds 



Borrelia burgdorferi 
(section 34 of 70) of 

898 | AE001148 | the complete genome 

Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath-
A) mRNA, complete 

899 | AF027173 | cds 



0.003 



0.003 



0.003 



Lycopersicon 
esculentum class II 
small heat shock 
protein Le-HSP 17.6 
900 | U72396 | mRNA, complete cds 



Mus musculus 
Pontin52 mRNA, 

901 | AF100694 | complete cds 

Chlamydomonas 



0.002 



0.002 



reinhardtii light 
harvesting complex II 
protein precursor 
(Lhcb3) mRNA, 
902 | AF104631 | complete cds 



903 



Mus musculus 
Pontin52 mRNA, 
AF100694 | complete cds 



0.002 



904 



Brassica rapa mRNA 
for SRK45, complete 
AB012106 | cds 



0.002 



905 



Human non-histone 
chromosomal protein 
HMG-14 gene, 
M21339 | complete cds. 



0.002 



0.002 



906 I AF012899 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 



0.002 



<NONE> 



4160388 



1709213 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



P VALUE 



<NONE> 

(AJ011856) ORF Q0255 
[Saccharomyces cerevisiae] 

NUCLEAR ENVELOPE PORE 
MEMBRANE PROTEIN POM 
121 (PORE MEMBRANE 
PROTEIN OF 121 KD) (P145) 



<NONE> 



1.5 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



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I SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



907 



908 



909 



910 



911 



ACCESSION | DESCRIPTION | P VALUE | ACCESSION 
P VALUE | ACCESSION | DESCRIPTION 



Human h-lys gene for 
lysozyme (upstream 
X57103 | region) 



0.002 



Sambucus nigra 
hevein-like protein 
AF074386 | mRNA, complete cds | 0.002 



Human CD4 
[promoter, partial 
UP 1066 [sequence. 



<NONE> 



<NONE> 



Barley mRNA 
L28094 | sequence 



0.002 



<NONE> 



Homo sapiens DNA 
from chromosome 19-
cosmid f19399 (~17 
kb EcoRI restriction 
AD000833 | fragment) 



0.002 



<NONE> 



0.002 



Homo sapiens TRHR 
gene promoter and 
912 | AJ011701 | exons 1-2, partial 



Mus musculus 
Pontin52 mRNA, 

913 | AF100694 | complete cds 

Homo sapiens retinol 
dehydrogenase gene, 

914 | AF037062 | complete cds 
Rattus norvegicus 



0.002 



0.002 



915 | AF093268 



917 | AF027173 



918 



Z46736 



919 | AB012106 



homer- 1c mRNA, 
complete cds 



Methanococcus 
jannaschii section 150 
of 150 of the 
complete genome 

Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA. complete 
cds 



0.002 



0.002 



H.sapiens DNA for 
repeat region (ABM- 
C82) 



Brassica rapa mRNA 
for SRK45, complete 
cds 



0.002 



0.002 



0.002 



0.002 






<NONE> 



<NONE> 



<NONE> 



<NONE> 
<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 
<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



P VALUE 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






X.laevis mRNA for 










920 


Z85983 


NOVA protein 


0.002 


<NONE> 


<NONE> 


<NONE> 


921 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA. complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


922 


S61977 


medium-chain acyl- 
CoA dehydrogenase 
{exon 10, intron 10} 
[human. Genomic, 
1407 nt] 


0.002 


<NONE> 


<NONE> 


<NONE> 


923 


AJ005813 


Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 


0.002 


<NONE> 


<NONE> 


<NONE> 


924 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


925 


AB012106 


Brassica rapa mRNA 
for SRK45, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 


926 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


0.002 


<NONE> 


<NONE> 


<NONE> 




X51646 


H.sapiens DNA for 
dopamine D2 
receptor gene 


0.002 


3329125 


(AE001337) YopC/Gen 
Secretion Protein D [Chlamydia 
trachomatis] 


9.5 


928 


AF100694 


Mus musculus 
Pontin52 mRNA, 
complete cds 


0.002 


465762 


H V FU I Ht 1 HJ AT 1 1 2. TRU 

rlv.U i ElIN ^(JOO^-l UN 

CHROMOSOME III 
>gi|630524|pir||S44748 
C06G4.1 protein - 
Caenorhabditis elegans 
>gi|409292 (L25598) homology 
with vigilin; coded for by C. 
elegans cDNA 

GenBank:M8S954 (CEL12C9); 
putative [Caenorhabditis 


8.9 


929 


U4S47S 


Human skeletal 
muscle ryanodine 
receptor gene 


0.002 


2137221 


co-repressor protein - mouse 
>gi|642619 


6.9 



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Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 




P VALUE 






Mus musculus 










930 


AF100694 


Pontin52 mRNA, 

complete cds 




BUCO JO 


(Z22520) membrane protein 
[Bacillus acidopullulyticus] 


6.3 


931 


AF100694 


Mus musculus 
Pontin52 mRNA, 

complete cds 


CI CMY7 




(AL023844) Y48A6B.1 

[Caenorhabditis elegans] 


5.8 


932 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


0.002 


3878330 


(Z81097) K07A1.4 
[Caenorhabditis elegans] 


4.8 


933 


AF093268 


Rattus norvegicus 
homer-1c mRNA, 
complete cds 


0.002 


137640 


REPLICATION PROTEIN El 
papillomavirus 


4.0 


934 


AF019660 


Mus musculus 
nuclear orphan 
receptor RORgamma 


0.002 


1330365 


(U58757) similar to nucleotide 
pyrophosphatases 


3.9 


935 


AF100694 


Mus musculus 
Pontin52 mRNA, 

complete cds 


0.002 


1 /JS5972 


(U46951) ORF5; Method: 
conceptual translation supplied 
by author 


3.7 


936 


V00508 


Human gene for 
epsilon-globin. 


0.002 


1333804 


(X560S2) protease 
[Ruminococcus flavefaciens] 


3.5 


937 


AB012105 


Brassica rapa mRNA 
for SLG45, complete 
cds 


0.002 


4153876 


(AC005531) similar to mouse 
homeodomain-interacting 
protein kinase 2; similar to

AF077659 (PID:g3702958) 


3.0 


938 


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


0.002 


1070461


ornithine carbamoyltransferase 
(EC 2.1.3.3) - yeast
(Saccharomyces cerevisiae)
>gi|929866 (X83502)
pid:e130025 [Saccharomyces
cerevisiae] >gi|1008256


2.8 


939 


S41458


rod cGMP
phosphodiesterase
beta-subunit [human,
mRNA, 3231 nt]


0.002 


3450883


(AF083334) fibroin [Antheraea
pernyi]


1.6 



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Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ 
ID 



ACCESSION I DESCRIPTION 



Drosophila



940 | X06286 



melanogaster Gart
locus with genes for
GARS=phosphoribos
ylamineglycine
ligase,
AIRS=phosphoribosy
lformylglycinamidine
cyclo-ligase,
GART=glycinamide
ribotide
transformylase >
gb|J02527|DROGAR
T D.melanogaster
Gart gene encoding
two polypeptides with
GAR synthase, AIR
synthase, and GAR
transformylase
enzyme activities and
a pupal cuticle gene
nested within intron
A of the Gart gene.



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



0.002 



2662054 



(AB004651) isocitrate lyase



941 | AF015812 



Homo sapiens RNA
helicase p68
(HUMP68) gene,
complete cds



0.002



3641659



(AB008374) alpha 3 type I
collagen



942 | X78925



H.sapiens HZF2
mRNA for zinc finger
protein



0.002 



141624 



ZINC FINGER PROTEIN ZFP
37 (MALE GERM CELL
SPECIFIC ZINC FINGER
PROTEIN)



1.1 



1.0 



943 | AF074386 



Sambucus nigra
hevein-like protein
mRNA, complete cds



0.002 



3879997 



(Z49071) weak similarity with
mu-type opioid receptor (Swiss
Prot accession number P33535)



1.0 



944 



Z69639 



Human DNA 

sequence from 

cosmid L241B9, 
Huntington's Disease
Region, chromosome
4p16.3 contains
polymorphic VNTR
pYNZ32.



0.002



3523162



(AF076292) TGF-beta/activin
signal transducer FAST-1p



0.81 






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SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



945 | AF074387 



946 | AF093268



947 | AF017307



948 | U11383



949 | AF012899



950 | AF086315



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Sambucus nigra 
hevein-like protein
mRNA, complete cds



Rattus norvegicus 
homer-1c mRNA,
complete cds 



Homo sapiens Ets- 
related transcription 
factor (ERT) mRNA, 
complete cds 



Drosophila 
melanogaster Ovo 
1028aa (ovo) mRNA, 
complete cds. 



Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 



ACCESSION 



0.002 



2984161 



0.002 



101830 



0.002 



200531 



0.002 



2465207 



0.002 



3834294 



Homo sapiens full 
length insert cDNA 
clone ZD52F10



0.002 



545067 



DESCRIPTION 



P VALUE



(AE000761) hypothetical 
protein [Aquifex aeolicus]



hypothetical protein B - chestnut 
blight fungus 



0.80 



0.72 



(M18071) prion protein [Mus
musculus] 



(AF016045) OVO-like 1
binding protein [Homo sapiens]



0.72 



(U80846) No definition line 
found [Caenorhabditis elegans]

action potential 
broadening potassium 
channel=Shab [Aplysia, bag cell 
neurons, head ganglia. Peptide 
905 aa] [Aplysia] 
>gi|743110|prf||2011375A K 
channel [Aplysia californica] 



0.35 



0.29 



0.15 



951 



X53096 



aureus genes 
encoding Sau96I 
DNA 

methyltransferase and 
Sau96I restriction 
endonuclease 



0.002 



952 | AB012105 



953 



X73973 



954 



S41458 



Brassica rapa mRNA 
for SLG45, complete 
cds 



2529575 



G.gallus RAR- 
gamma2 mRNA for 
retinoic acid receptor 



0.002 



729918 



rod cGMP 
phosphodiesterase 
beta-subunit [human,
mRNA, 3231 nt]



0.002 



586122 



0.002 



1017427 



(AF018164) kinesin-like protein
3C [Homo sapiens]



LA PROTEIN HOMOLOG (LA 
RIBONUCLEOPROTEIN) (LA 
AUTOANTIGEN HOMOLOG) 



TRICHOHYALIN 
>gi|423321|pir||A40691
trichohyalin - sheep >gi|295941
(Z18361) trichohyalin



(X90569) elastic titin [Homo 
sapiens]



0.11 



0.092 



0.073 



o.o i: 



WO 01/02568 PCT/US00/18374 



SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION 



955 | M35887 



DESCRIPTION | P VALUE



D.melanogaster
defective chorion-1
fcl25 (dec-1) gene,
complete cds. 



0.002 



956 | AF034099

Laccaria bicolor
glyoxal malate
synthase protein
mRNA, complete cds

957 | AF033929

Bactrocera dorsalis
strain Tahiti
mitochondrial D-loop
region, complete
sequence

AB012106

Brassica rapa mRNA
for SRK45, complete
cds



0.002 



958 



959 | AF029062



Homo sapiens DEAD-
box protein (BAT1)
gene, partial cds



Human ataxin-2



9e-04 



8e-04 



8e-04 



Nearest Neighbor (BlastX vs. Non-Redundant Protein!) 



ACCESSION 



DESCRIPTION 



(U88169) similar to



1825606 



molybdopterin biosynthesis
MOEB proteins [Caenorhabditis
elegans]



1825593 



(U88167) D2092.2 gene product
[Caenorhabditis elegans]



960 | U70671



related protein
mRNA, partial cds



961 | AF051709



Dendrocopos
leucopterus clone 2
microsatellite HrU2
repeat region



8e-04 



962 | X14077



Pea phy gene for
phytochrome
apoprotein



8e-04 



963 



AC004497



Homo sapiens
chromosome 21, P1
clone LBNL#6



8e-04



964 | AF077344



Homo sapiens
cartilage-derived C-
type lectin



965 | X85117



H.sapiens epb72 gene
exons 2,3,4,5,6,7



966 | AF100694



Mus musculus
Pontin52 mRNA,
complete cds



8e-04 



8e-04 



8e-04 



8e-04 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



457146 



3702123 



2570059 



1345S59 



(L27838) rhoptry protein 
[Plasmodium yoelii]



(AJ011707) TraD protein
[Escherichia coli]



^004687) N-4 cytosine-
specific methyltransferase
[Neisseria gonorrhoeae]



COPPER TRANSPORT 
PROTEIN CTR1 transport 
protein - yeast (Saccharomyces 
cerevisiae) gene product 
[Saccharomyces cerevisiae]



P VALUE



0.008 



1e-06



<NONE>



<NONE>



<NONE>



<NONE>



9.6 



8.5



6.8 



6.7 









SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



967 | AF031403



968 



L29252 



Homo sapiens 



ACCESSION 



MLL/AF4 
translocation 
breakpoint 
t(4;ll)(q21;23) 



DESCRIPTION 



969 



X16995



Human (clone D13-2)
L-iditol-2 
dehydrogenase gene, 
exon 4, exon 5, exon 
6 and exon 7. 



8e-04 



2498926 



Mouse N10 gene for
a nuclear hormonal 
binding receptor 



8e-04 



1488070 



(U63997) putative transposase 
[Enterococcus faecium]



8e-04 



(U47323) stromal cell protein 
J493833 |[Mus musculus] 



970 | M99412 



Human interleukin-8 
receptor (IL8RB) 
gene, complete cds



971 



U37452 



Human Down 
Syndrome region of 
chromosome 21
genomic sequence, 
clone A31D6-1C5. 



8e-04 



1346101 



4-AMINOBUTYRATE
AMINOTRANSFERASE
TRANSAMINASE) (GABA
AMINOTRANSFERASE)
homolog - smut fungus
(Ustilago maydis) >gi|881562
Emericella nidulans gamma-
amino-n-butyrate transaminase
Swiss-Prot Accession Number
P14010 [Ustilago maydis]



8e-04



4164069 



972 | AF100694



(AF111093) latrophilin 3 splice
variant bbah [Bos taurus]
Hi^UlRhllLAL U.UKJJ 



973 | AF093268 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Rattus norvegicus 
homer-1c mRNA,
complete cds 



8e-04 



1352877 



PROTEIN IN RAD26-GEF1 
INTERGENIC REGION 
>gi|1077881|pir||S57057
probable membrane protein
YJR038c - yeast
(Saccharomyces cerevisiae)
>gi|l01568S (Z49538) ORF
YJR038c putative
[Saccharomyces cerevisiae]



P VALUE



SMALL PROTEIN B 
HOMOLOG A43259. from E. 
hirae [Mycoplasma 

pneumoniae]



5.5



8e-04 



1788557 



(AE000312) orf, hypothetical 
protein [Escherichia coli]



5.2 



3.2 



0.83 



0.26 



0.23 



0.19 








Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION




P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















974 


X83872 


H. vulgaris mRNA for
cAMP response 
element binding 
protein 


8e-04 


1175386 


HYPOTHETICAL 37.7 KD 
PROTEIN C18B11.06 IN
CHROMOSOME I
>gi|2130289|pir||S58305
hypothetical protein
SPAC18B11.06 - fission yeast
hypothetical protein 
[Schizosaccharomyces pombe] 


0.005 


975 


M32514 


Rat simple sequence 
DNA, clone 5. 


8e-04 


2394492 


(AF024502) No definition line 
found [Caenorhabditis elegans] 


0.002 


976 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-04 


2981631 


(AB012223) ORF2 [Canis 
familiaris]


0.001 


977 


X89211 


H.sapiens DNA for 
endogenous retroviral 
like element


8e-04 


2065210 


(Y12713) Pro-Pol-dUTPase 
polyprotein 


3e-04 


978 


U14391 


Human myosin-IC 
mRNA, complete cds. 


8e-04


3142302 


(AC002411) Strong similarity to
myosin heavy chain gb|Z34293 
from A. thaliana. [Arabidopsis 
thaliana] 


4e-16 


979 


L13612 


Drosophila 
melanogaster dead- 
box protein 
D.melanogaster 
DEAD-box gene, 
complete CDS 


8e-04 


3776027 


(AJ010475) RNA helicase 
[Arabidopsis thaliana] 


9e-24 


980 


AF074386 


Sambucus nigra 
hevein-like protein
mRNA, complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


981 


AF100694


Mus musculus
Pontin52 mRNA, 
complete cds 


7e-04 


<NONE> 


<NONE> 


<NONE> 


982 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


7e-04 


<NONE> 


<NONE> 


<NONE> 


983 


Z739S7


Human DNA
sequence from
cosmid N120B6 on
chromosome 22
Contains ESTs,
complete sequence
[Homo sapiens]


7e-04 


<NONE> 


<NONE> 


<NONE> 






Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ
ID | ACCESSION



DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE



984 | AB012106



Brassica rapa mRNA 



for SRK45, complete 
cds 



7e-04 



<NONE> 



<NONE> 



985 | AF093268 



Rattus norvegicus 
homer-1c mRNA,
complete cds 



7e-04 



<NONE> 



986 | AF027174 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



7e-04 



987 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



<NONE> 



7e-04 



988 | AJ005813



Arabidopsis thaliana 
mRNA for 
neoxanthin cleavage 
enzyme 



<NONE> 



7e-04 



989 | AF064029



Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 



990 | AF027174



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 



991 | AF027173 



Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
A) mRNA, complete 
cds 



992 | AF064029 



Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 



<NONE> 



7e-04 



<NONE> 



7e-04 



<NONE> 



7e-04 



<NONE> 



7e-04 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



<NONE> 



993 | AF100694



Mus musculus
Pontin52 mRNA,
complete cds



7e-04 



<NONE> 



994 | U76524



Sambucus nigra
ribosome inactivating
protein precursor 
mRNA, complete cds 



7e-04 



3327230 



<NONE> 



<NONE> 



(AB014608) KIAA0708 protein 
[Homo sapiens]



9.5 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















995 


U76524


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


7e-04 


3327230 


(AB014608) KIAA0708 protein
[Homo sapiens] 


9.3 


996 


AF074387


Sambucus nigra 
hevein-like protein 
mRNA. complete cds 


7e-04 


3876455 


(Z93380) predicted using 
Genefinder; similar to 7tm 
receptor protein [Caenorhabditis 
elegans] 


7.1 


997 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


7e-04 


2128771 


hypothetical protein MJ1293 - 
Methanococcus jannaschii 
>gi|1591931 (U67570)M. 
jannaschii predicted coding 
region MJ1293 [Methanococcus 
jannaschii] 


6.2 


998


U09412


Human zinc finger 
protein ZNF134 
mRNA. complete cds 


7e-04 


1083336 


glutathione transferase (EC 
2.5.1.18) piA - mouse 


5.4 


999 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-04


473515 


(M17619) NADH
dehydrogenase subunit ND4 
[Asterina pectinifera] 


3.7 


1000


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


7e-04 


1724097 


(U79772) female sex protein 
[Mercurialis annua] 


3.3 


1001


AF100694 


Mus musculus 
Pontin52 mRNA,
complete cds


7e-04


1197103


(D49747) core, env, and part of
E2/NS1 


3.2 


1002


X16995


Mouse N10 gene for
a nuclear hormonal
binding receptor


7e-04 


345372


unc-5 protein, long form -
Caenorhabditis elegans
>gi|258529|bbs|118648
(S47168) UNC-
5=immunoglobulin and
thrombospondin type I
transmembrane protein
{alternatively spliced} aa]
[Caenorhabditis elegans]
>gi|2662596 (AF03669S) C.
elegans UNC-5 (NID:°25852)


2.7 






SEQ 
ID 


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


1003 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA. complete cds 


7e-04 


4204220


(AB022866) mobilization
protein


2.5


1004 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


7e-04


3201550


(Y17116) fibrinogen-binding
protein


2.4


1005 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA. complete cds 


7e-04 


1 1174264 


(U45966) polyprotein [Hepatitis 
G virus] 


0.73


1006


AF027173


Arabidopsis thaliana


cellulose synthase 
catalytic subunit (Ath- 
A) mRNA. complete 
cds 


7e-04 


135308 


TRANSCRIPTION FACTOR 
JUN-D 


0.065


1007




H.sapiens EWS gene, 
intron 6, 
polymorphism 


7e-04 


728836 


! ! ! ! ALU SUBFAMILY SP 
WARNING ENTRY 


0.001


1008


AJ005813


Arabidopsis thaliana
mRNA for
neoxanthin cleavage
enzyme


7e-04


1633564


(U47924) C8 [Homo sapiens]


9e-09


1009 


AF074386 


Sambucus nigra 
hevein-like protein
mRNA. complete cds 


6e-04 


284171 


Ig epsilon chain C region form 3 
- human 


1.3


1010


AB012106


Brassica rapa mRNA
for SRK45, complete
cds


6e-04


3845262 


(AE001414) BRAHMA 

LM uiuivjg llCUCuoC 

superfamily II) 


0.25


1011


AL034404


Human DNA
sequence from clone
U7C12 on
chromosome Xp22. 1 1
sequence [Homo
sapiens]


3e-04


<NONE> 


<NONE> 


<NONE>


1012


M99701


Homo sapiens (pp21)
mRNA, complete cds.


3e-04


<NONE>


<NONE>


<NONE>






Nearest Neighbor (BlastN vs. Genbank) 



SEQ 

ID | ACCESSION



DESCRIPTION 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



P VALUE



1013 | U00227



Ovis aries Merino 
breed DR beta-chain 
antigen binding 
domain. MHC class II 
DRB (Ovar-DRB24) 
gene, partial cds. 



3e-04 



<NONE> 



<NONE> 



<NONE> 



1014 | AF074387



Sambucus nigra 
hevein-like protein 
mRNA. complete cds 



3e-04 



<NONE> 



<NONE> 



1015 | U95102



Xenopus laevis 
mitotic 

phosphoprotein 90 
mRNA, complete cds 



1016 | AB012106



Brassica rapa mRNA
for SRK45, complete
cds



1017 | AJ010737



Mus musculus DNA
for microsatellite 3kb
upstream Ibp gene



1018 | AF053137



Homo sapiens histone
deacetylase 3 gene,
exons 4, 5, 6, 7, 8, 9,
and 10



1019 | AF027173



Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath
A) mRNA, complete
cds



1020 | AC004173



1021 | X57025



1022 | X77090



Homo sapiens clone
UWGC:y23x0U
from 6p21, complete
sequence [Homo
sapiens]



Human IGF-I mRNA
for insulin-like
growth factor I



H. sapiens IL-1Ra
gene. 



3e-04 



999418 



(L19655) ORE [Tomato 
ringspot virus]



3e-04 



2367460 



(AF011415) putative
pheromone receptor [Mus
musculus]



8.3 



7.0 



3e-04 



4106549 



(AF104411) neuronal-specific 
septin 3 [Mus musculus]



3e-04 



416702 



NADH-DEPENDENT FLAVIN
OXIDOREDUCTASE acid-
inducible - Eubacterium sp
>gi|1381570 (U574S9)
NADH:flavin oxidoreductase
[Eubacterium sp. VPI 12708]



5.5 



5.3 



3e-04 



1785789 



3e-04 



558521 



3e-04 



4206707 



3e-04 



1065941 



(Y08502) orf111d [Arabidopsis
thaliana] 



5.1 



(D28917) polyprotein [Hepatitis
C virus] 



1.1 



(AF118122) putative outer
membrane protein OmpU 



0.65 



(U40799) F42C5.7 gene product 
[Caenorhabditis elegans] 



0.12 








Nearest 


Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Pseudorabies virus 










1023 


M34651 


with upstream and 

downstream

sequences. 


3e-04 


2746853 


(AF040650) contains similarity 
to sodium-potassium-chloride 
cotransport proteins 


7e-05 


1024 


Z36011 


S.cerevisiae 
chromosome II 
reading frame ORF 
YBR142w 


3e-04 


2500537 


PROBABLE ATP-
DEPENDENT RNA
HELICASE HAS1
>gi|626265|pir||S47451
hypothetical protein YMR290c
RNA helicase [Saccharomyces
cerevisiae]


4e-08 


1025 


AF020286 


Dictyostelium 
discoideum 2034 
gene, partial cds 


3e-04


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans] 


6e-14 


1026 


L26049 


Chlamydomonas
reinhardtii dynein
heavy chain alpha
(ODA11) gene, exons
2-15, and partial cds. 


3e-04 


3876775 


(Z81077) predicted using
Genefinder; Similarity to Yeast 

nmtpin ROAR /"TP -r:^fi7*s1 1 \ 




1027 


AF020286 


Dictyostelium 
discoideum 2034 
gene, partial cds 


3e-04 


1465834 


(U64857) No definition line 
found [Caenorhabditis elegans]


1e-17


1028


X798U


S.cerevisiae ACT3
gene


3e-04 


3876090


(/.cw>j:>) Similarity to Yeast 
uridine kinase 

(SW:URK1_YEAST); cDNA 
EST EMBL:Z14695 comes
from this gene; cDNA EST 
CEMSE17F comes from this 
gene; cDNA EST 
EMBL:D67355 comes from this 
gene; cDNA EST yk209hl.5 
comes from this ge...


7e-31 


1029 


AF027173


Arabidopsis thaliana
cellulose synthase
catalytic subunit (Ath-
A) mRNA, complete
cds


2e-04 


<NONE> 


<NONE> 


<NONE> 


1030 


M22970


Human pancreatic
phospholipase A-2
(PLA-2) gene, exons
1 to 3.


2e-04 


<NONE> 


<NONE> 


<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


1031


Z68686


Human DNA
sequence from
cosmid N2E9 on
chromosome 22.
Contains EST
complete sequence
[Homo sapiens]


2e-04 


<NONE> 


<NONE> 


<NONE> 


1032 


X95154 


exon 4 > :: 

emb|A62779|A62779 
Sequence 20 from 
Patent WO9719110 


2e-04 


" <NONE> 


<NONE> 


<NONE> 


1033 


AJ005813 


Arabidopsis thaliana 

mRNA for

neoxanthin cleavage 
enzyme 


2e-04 


<NONE> 


<NONE> 


<NONE>


1034 


AF100694


Mus musculus 
Pontin52 mRNA, 
complete cds 


2e-04 


<NONE>


<NONE>


<NONE>


1035


AE00I4I S 

ru<w It 1 J 


Plasmodium 
falciparum 
chromosome 2, 
section 52 of 73 of 
the complete 
sequence


2e-04 


<NONE> 


<NONE> 




1036


AF090115


Lycopersicon
esculentum cytosolic
class II small heat
shock protein HCT2
(HSP17.4) mRNA,
complete cds


2e-04 


<NONE> 


<NONE> 


<NONE>
<NONE>


1037


AC000958


Homo sapiens
(subclone 6_d9 from
P1 H21) DNA
sequence


2e-04 


<NONE>


<NONE>


<NONE>


1038


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


2e-04 


2501523


CD59 GLYCOPROTEIN
PRECURSOR


7.1


1039


U76524


Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds


2e-04 


2765360


(Y13925) cathepsin L2 [Penaeus
vannamei]


6.S 1 
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Nearest 


Neighbor (BlastN vs. 


Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 


1040


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA, complete 
cds 


2e-04


133636 


RNA POLYMERASE
>gi|67126|pir||RRXPLC RNA-
directed RNA polymerase (EC
2.7.7.48) - lymphocytic
choriomeningitis virus (strain
Armstrong 53b) >gi|331369


5.2 


1041 


AB012106


Brassica rapa mRNA
for SRK45, complete
cds


2e-04


3822155 


(AF074613) type II secretion
protein [Escherichia coli
O157:H7]


4.0 


1042 


U76524 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


2e-04


1718125 


REGULATORY PROTEIN E2 
>gi| 1020222 type 361 


0.38 


1043 


X17058


Sus scrofa mRNA for
glucose transporter
protein 


2e-04 


3341906 


(AB009593) xylose transporter 


2e-15 


1044 




Homo sapiens 
candidate tumor 
suppressor pp32rl 


1e-04


<NONE> 


<NONE> 


<NONE> 


1045 


X98890 


S. tuberosum mRNA 
for inorganic 
phosphate 
transporter, StPT1


1e-04


624126 


(U425S0) a65L [Paramecium 
bursaria Chlorella virus 1] 


7.9 


1046 


L14930


Glycine max (Rab7p) 
mRNA. complete cds. 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1047 


AJ009970 


Mus musculus 
thromboxane A2 
receptor gene, exon 3, 
partial 


9e-05 


<NONE> 


<NONE> 


<NONE> 


1048 


Y11896 


M.musculus mRNA
for Brx gene, partial


9e-05 


<NONE> 


<NONE> 


<NONE> 


1049 


L10832


Polistes annularis
(clone pan48AAT)
tandem repeat region.


9e-05 


<NONE> 


<NONE> 


<NONE> 


1050 


AF055011


Homo sapiens clone
24587 mRNA
sequence


9e-05 


3880586


^< | y/b8) cDNA EST
EMBL:D28009 comes from this
gene; cDNA EST
EMBL:D28008 comes from this
gene; cDNA EST
EMBL:D32478 comes from this
gene; cDNA EST
EMBL:D34508 comes from this
gene; cDNA EST
EMBL:D37581 comes from this
gene; ...


7.6 
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SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


1051 


U76524 


Sambucus nigra 
ribosome inactivating
protein precursor 
mRNA, complete cds 


9e-05 


• 

3024292 


RHODOPSIN >gi|2290717 
(AF000947) rhodopsin [Sepia 
officinalis] 


6.7 


1052 


Z58294


H.sapiens CpG DNA 
clone 34d6, forward 
read cp?34d6.ftla . 


9e-05 


3885496 


(AF064825) heparin/heparan 
sulfate N-acetylglucosaminyl N-
deacetylase/N-sulfotransferase
[Bos taurus] 


0.65 


1053 


D87451


Human mRNA for 
KIAA0262 gene, 
complete cds 


9e-05 


" 3874739 


(Z66495) similar to claustrin 
like 


0.004 


1054 


L37092 


Mus musculus cyclin- 
dependent kinase 
homologue 


9e-05 


3080513 


(AL022598) hypothetical 
protein 


4e-09 


1055 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1056 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1057 


AF074386 


Sambucus nigra 
hevein-like protein 
mRNA, complete cds 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1058


D10102


Homo sapiens DNA
from cosmid
clone:844, GT repeat
sequence 


8e-05 


<NONE> 


<NONE> 


<NONE> 


1059


U72396


Lycopersicon
esculentum class II
small heat shock
protein Le-HSP17.6
mRNA, complete cds


8e-05 


1176475


HYPOTHETICAL 80.4 KD
PROTEIN IN SMC3-MRPL8
>gi|1078237|pir||S56849
probable membrane protein
YJL073w - yeast
(Saccharomyces cerevisiae)
>gi|895898 (X88851)
hypothetical protein YJL073w
[Saccharomyces cerevisiae]


6.0 


1060


X71934


H.sapiens XB gene
for tenascin-X, repeat
III


8e-05 


285207


microtubule-associated protein,
110K tau - rat >gi|207158
(M84156) big tau [Rattus
norvegicus]


3.7 
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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 


ACCESSION 


DESCRIPTION 


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1061 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath 
B) mRNA. complete 
cds 


8e-05 


4049682 


(AF063866) ORF MSV092 
hypothetical protein 
[Melanoplus sanguinipes 
entomopoxvirusl 


2.1 


1062 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA, 
complete cds 


8e-05 


3861019 


(AJ235271) unknown
[Rickettsia prowazekii] 


5e-14 


1063 


AF027174 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
B) mRNA, complete 
cds 


7e-05 


<NONE> 


<NONE> 


<NONE> 


1064 


L04193 


Human lens 
membrane protein 
(mpl9) gene, exon 
11. 


7e-05 


<NONE> 


<NONE> 


<NONE> 


1065 


X61609


B.napus gene for
LHCII Type III
chlorophyll a/b
binding protein


7e-05


2132314 


hypothetical protein YPR174c - 
yeast similarity to a nuclear 
lamin from C. elegans (PIR 
accession number S42257) 
[Saccharomyces cerevisiae]


8.9 


1066 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA,






(AB006757) PCDH7 (BH- 
Pcdh)c [Homo sapiens] 


5.7 


1067 


AF027173 


Arabidopsis thaliana 
cellulose synthase 
catalytic subunit (Ath- 
A) mRNA, complete 
cds 


7e-05 


2493696 


HYPOTHETICAL 21.5 KD 
PROTEIN (ORF 185)
>gi|1480440 (U34204)
ORF185; hypothetical 21.4 kD
protein [Brassica oleracea]


5.2 


1068 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


7e-05 


2501029


PROBABLE LEUCYL-TRNA
SYNTHETASE,
MITOCHONDRIAL
PRECURSOR (LEUCINE--
TRNA LIGASE) (LEURS)
KIAA0028 [Homo sapiens]


1.4 
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Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 
ID 






P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 






Human DNA 










1069 


Z68758 


sequence from 
cosmid cN85E10 on
chromosome 22q11.2
qter


3e-05 


<NONE> 


<NONE> 


<NONE> 


1070 


X60653 


human Histone H3.3 
pseudogene (CE- 
456) 


3e-05 


<NONE> 


<NONE> 


<NONE> 


1071 


Z58294 


H.sapiens CpG DNA, 
clone 34d6, forward 
read cpp34d6.ftla . 


3e-05 


1706241


GUANYLYL CYCLASE GC-E 
PRECURSOR cyclase receptor 
[Mus musculus] 


9.6 


1072 


AF043251


Homo sapiens 
mitochondrial outer 
membrane protein 
(Tom40) gene, 
nuclear gene 
encoding 
mitochondrial 
protein, exons 1 
through 6 


3e-05


113980 


AMINE OXIDASE [FLAVIN- 
CONTAINING] B oxidase 
(flavin-containing) (EC 1.4.3.4) 
B - human B [human, platelet, 
Peptide Partial, 520 aa] [Homo 
sapiens]


8.9 


1073 


M31104 


Chicken progesterone 
receptor gene, 
encoding forms A and 
B. exons 1 and 2. 


3e-05


1170841 


IG GAMMA LAMBDA 
CHAIN V-II REGION 


4.8 


1074 


AF012899


Sambucus nigra 
ribosome inactivating 
protein precursor
mRNA, complete cds 


3e-05 


543684 


ribosomal protein S3 - 
Chlamydomonas humicola
chloroplast (fragment) 


4.2 


1075 


L22206 


Human vasopressin 
receptor V2 gene,
complete cds.


3e-05 


791207 


(U20615) Gnot1 homeodomain
protein [Gallus gallus] 


1.8 


1076 


AF093268


Rattus norvegicus
homer-1c mRNA,
complete cds


3e-05 


3237340 


(AF033361) polyprotein
[Hepatitis C virus]


0.94 


1077 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


3e-05 


2879805


(AL021813) hypothetical
protein 


0.001 


1078 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


3e-05 


3877951


(Z81555) predicted using
Genefinder


3e-07 



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Nearest Neighbor (BlastN vs. Genbank) 


Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 


SEQ 


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1079 


AF090115 


Lycopersicon 
esculentum cytosolic 
class II small heat 
shock protein HCT2 
(HSP17.4) mRNA,
complete cds 


2e-05 


<NONE> 


<NONE> 


<NONE> 


1080 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


3880197 


(Z81132) predicted using
Genefinder


2.4


1081 


AF087989 


Homo sapiens full 
length insert cDNA 
clone YX29D10 


2e-05 


113667


!!!! ALU CLASS B WARNING
ENTRY !!!!


1.8


1082 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


474896 


(L31967) mating type protein 
[Coprinus cinereus]


1.4 


1083 


AF064029 


Helianthus tuberosus 
lectin 1 mRNA, 
complete cds 


2e-05 


2266988 


(Y13274) M33 polycomb-like
protein [Mus musculus] 


0.62 


1084 


U67415 


Equus caballus UCD- 
E-CA-467 
dinucleotide repeat 
region, complete 
sequence 


1e-05


<NONE>


<NONE>


<NONE>


1085 


X67277 


H. sapiens BGP gene 
for biliary 
glycoprotein, 
promoter region and 
exon 1


1e-05


<NONE> 


<NONE> 


<NONE> 


1086 


X85117 


H.sapiens epb72 gene 
exons 2,3,4,5,6,7


1e-05


<NONE> 


<NONE> 


<NONE> 


1087 


U88328


Mus musculus
suppressor of
cytokine signalling-3


1e-05


443877


(Z29457) core region;
pid:g443877 [Hepatitis C virus]


3.9 


1088 


Y12853


Homo sapiens P2X7
gene, exon 4-8


1e-05


3878726


(Z66498) similar to cuticle
collagen; cDNA EST
EMBL:D75584 comes from this
gene


0.36 


1089 


AE001140


Borrelia burgdorferi
(section 26 of 70) of
the complete genome


1e-05


3860719


(AJ235270) GLUTAMYL-
TRNA AMIDOTRANSFERASE
SUBUNIT A (gatA) [Rickettsia
prowazekii]


4e-15
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Nearest 


Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1090 


AJ224112 


Homo sapiens gamma
adaptin gene, exon 2 
and flanking intronic 
sequences 


i 

9e-06 


<NONE> 


<NONE> 


<NONE> 


1091 


AB000565 


Homo sapiens DNA 
for repeat sequence 
Alu . 


9e-06 


72879 


translation initiation factor IF- 2 
Escherichia coli 


5.1 


1092 


Z78985 


H. sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20B4 


9e-06 


159975


(M65164) 51C surface protein 
[Paramecium tetraurelia] 


4.8 


1093 


Z21677 


Thermotoga maritima 
DNA for spc operon 


9e-06 


585879 


50S RIBOSOMAL PROTEIN 
L2 maritima >gi|437926 
(Z21677) ribosomal protein L2 


7e-14 


1094


AF031494


Drosophila hydei 
Dhc7 (Threads) 
mRNA, complete cds 


9e-06 


729377 


DYNEIN BETA CHAIN, 
CILIARY sea urchin 
(Anthocidaris crassispina) chain 
[Anthocidaris crassispina] 


4e-18 


1095 


AF051315 


Homo sapiens 
placental protein 
17al (PP17) mRNA, 
complete cds 


4e-06 


<NONE> 


<NONE> 


<NONE> 




1096


AC001460


Homo sapiens 
(subclone 2_f4 from 
BACH 107) DNA 
sequence 


4e-06 


2648304 


(AE000952)ISA1214-6. 
putative transposase 


6.2 


1097 


X85030 


H.sapiens mRNA for 
skeletal muscle-
specific calpain


4e-06 


4239857 


(AB016726) calpain 
[Schistosoma japonicum] 


0.006 


1098 


M75162 


Human polymorphic 
arylamine N- 
acetyltransferase 


3e-06 


<NONE> 


<NONE> 


<NONE> 


1099 


AB009999


Rattus norvegicus
mRNA for CDP-
diacylglycerol
synthase, complete
cds


3e-06 


3879045 


(Z70309) R102.6
[Caenorhabditis elegans]


7.3 


1100 


Z78985


H.sapiens flow-sorted
chromosome 6
HindIII fragment,
SC6pA20B4


3e-06


266529


MERCURIC REDUCTASE
(HG(II) REDUCTASE)
>gi|418744|pir||S3016S
mercury(II) reductase


6.5 


1101


AB012190


Homo sapiens mRNA
for Nedd8-activating
enzyme hUba3,
complete cds


3e-06


3877938


(Z79697) F58H10.1
[Caenorhabditis elegans]


6.3 
Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE


ACCESSION 


DESCRIPTION 


P VALUE 






1102


AF041056


Homo sapiens
WSCR4 gene, exons
3 and 4


3e-06 


1568583 


(Z80775) hypothetical protein 
Rv0044c 


1.9 


1103 


X00777 


Mouse E(d) beta gene
5' flanking region and
exon 1 


3e-06 


1680722 


(U72497) fatty acid amide 
hydrolase [Rattus norvegicus] 


0.008 


1104 


D21205 


Human mRNA for 
estrogen responsive 
finger protein, 
complete cds 


3e-06 


563127 


(U09825) acid finger protein 
[Homo sapiens]


1e-05


1105 


Z47046 


Human cosmid 
QLL2C9 from Xq28 


1e-06


<NONE>


<NONE> 


<NONE> 


1106 


L26261 


Human MHC class III 
HLA-RP1 gene. 


1e-06


<NONE> 


<NONE> 


<NONE> 


1107 


M13402


Rat 5S RNA gene,
clone 5S-2.


1e-06


<NONE> 


<NONE> 


<NONE> 


1108 


X68793 


H.sapiens gene for 
antithrombin III 


1e-06


<NONE> 


<NONE> 


<NONE> 


1109 


AF003540 


Homo sapiens 
Krueppel family zinc 
finger protein 


1e-06


2507553 


ZINC FINGER PROTEIN 33A 
(ZINC FINGER PROTEIN 
KOX31) (KIAA0065) 
(HA0946) Kruppel-related. 
[Homo sapiens] 


0.098 


1110 


L42096 


Homo sapiens 
(subclone 10_d2 from 
P1 H21) DNA
sequence.


1e-06


1330401 


(U58762) T27F7.1 gene product 
[Caenorhabditis elegans]


0.015 


1111 


Z69925 


Human DNA
sequence from
cosmid cN116A5,
between markers
D22S280 and
D22S86 on
chromosome 22q12
contains EST 


9e-07 


<NONE> 


<NONE> 


<NONE> 


1112 


D90217


S. cerevisiae gene for
YmL33,
mitochondrial
ribosomal proteins of
large subunit


9e-07


3879097


(Z81109) predicted using
Genefinder; similar to
sodium/phosphate transporter;
cDNA EST yk326f6.3 comes
from this gene; cDNA EST
yk326f6.5 comes from this gene
[Caenorhabditis elegans]


7.1 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION


DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


1113 


AF012899 


Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds 


9e-07 


1330345


(US8/15) coded for by C.
elegans cDNA ykJ4bl.3; coded
for by C. elegans cDNA
ykl3hl0.5; coded for by C.
elegans cDNA yk46e8.5; coded
for by C. elegans cDNA
yk46d5.5; coded for by C.
elegans cDNA yk43c2.5; coded
for by C. elegans cDNA
yk46e8.... 


2e-29 


1114 


AF086562 


Homo sapiens full 
length insert cDNA 
clone ZE16C03 


4e-07


1072210


(U40945) coded for by C.
elegans cDNA yk74b9.3; coded
for by C. elegans cDNA
yk74b9.5; similar to repeat of 
calcium channel alpha subunits; 
similar to tetracycline resistance 
protein; similar to hypothetical 
protein in HSP30-PMP1 region 
(SP... 


3.9 


1115 


L39062 


Homo sapiens 
interleukin 9 receptor 
IL9R pseudogene, 
exons 1-9 


4e-07 


3879983 


U4b/ U i) similar to 
transforming protein etc2; 
cDNA EST EMBL:D34137
comes from this gene; cDNA 
EST EMBL:D37172 comes 
from this gene; cDNA EST 
EMBL:D76266 comes from this 
gene; cDNA EST 
EMBL:D70493 comes from this 
gene; cDNA ... 


3.3 


1116 


Z69364


Human DNA
sequence from
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST
and cDNA. > ::
emb|Z69365|HSL96F
A Human DNA
sequence from
cosmid L96F8,
Huntington's Disease
Region, chromosome
4p16.3 contains EST
and cDNA.


4e-07


3493176


(AF022SS9) latent TGF beta
binding protein [Mus musculus]


3.0 
SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION


P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION


DESCRIPTION


P VALUE


1117


D79986


Human mRNA for
KIAA0164 gene,
complete cds


4e-07


4038031


(AC005936) hypothetical
protein [Arabidopsis thaliana]


0.30 


1118


D43950


Human mRNA for
KIAA0098 gene,
partial cds


3e-07 


<NONE> 


<NONE> 


<NONE> 


1119 


AF037168 


Arabidopsis thaliana 
DnaJ homologue 
(AU6) mRNA, 
complete cds 


3e-07 


3881075 


tAJ-UJ^bD /; predicted using 
Genefinder; similar to DnaJ 
domain; Thioredoxin; cDNA
EST yk433f3.5 comes from this 
gene; cDNA EST 
EMBL:D32359 comes from this 
gene; cDNA EST 
EMBL:D34721 comes from this 
gene; cDNA EST yk433f3.3 c... 


3e-09 


1120 


X69838 


H.sapiens mRNA for 
G9a 


3e-07 


3873414 


(U00043) similar to D. 
melanogaster trithorax protein 


3e-29 


1121 


AB011124 


Homo sapiens mRNA 
for KIAA0552 
protein, complete cds 


2e-07 


2618749 


(U90880) hypothetical protein 
2; predicted using XGrail 


2.0 


1122 


K03012 


Human cellular fms 
proto-oncogene, 
partial cds. 


1e-07


<NONE> 


<NONE> 


<NONE> 


1123 


AB016195


Homo sapiens DNA,
microsatellite and Alu
repeat region


1e-07


728837 


!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 


0.095 


1124


Y16795


Homo sapiens
wihHaA pseudogene 


4e-08 


<NONE> 


<NONE> 


<NONE> 


1125 


AB012624


Homo sapiens FLI1
gene for ERGB
transcription factor,
intron 4 and partial
cds


4e-08 


728836 


!!! ALU SUBFAMILY SP 
WARNING ENTRY 


3.6 


1126 


AJ131341


Homo sapiens ogg1
gene, exons 1-7


4e-08


113668


!!! ALU CLASS C WARNING 
ENTRY !!!! 


3e-05 


1127 


L81902


Homo sapiens
(subclone 1_c10 from
P1 H69) DNA
sequence


3e-08


4225950


(AJ132701) centaurin gamma 1B


1.8 


1128


Y17968


Gallus gallus mRNA
for high mobility
group 1 protein


3e-08


3041855


(AC004537) similar to tumor
suppressor p33ING1; similar to
AF044076 (PID:g2829208)
[Homo sapiens]


3e-31 


1129 


Y13901


Homo sapiens FGFR-
4 gene


1e-08


<NONE>


<NONE>


<NONE>



WO 01/02568 PCT/US00/18374



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION | DESCRIPTION



1130 | L22024



1131 | AF012899



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Mesocricetus auratus 
serum amyloid P 
component gene, 
complete cds. 

Sambucus nigra 
ribosome inactivating 
protein precursor 
mRNA, complete cds



1132 | X14034



Human mRNA for
phospholipase C > ::
gb|M37238|HUMPL
C Human 
phospholipase C 
mRNA, complete cds 
H.sapiens CpG DNA, 



1133 | Z59381



clone 152b10,
forward read
cpg152b10.ft1a.



1134 | L81839



Homo sapiens
(subclone 2_h3 from
P1 H43) DNA
sequence



P VALUE | ACCESSION



DESCRIPTION 



1e-08



1e-08



<NONE>



<NONE>



P VALUE



<NONE> 



<NONE> 



<NONE> 



1e-08



<NONE> 



<NONE> 



1e-08



<NONE> 



<NONE> 



1135 | X14448



1136



Human GLA gene for
alpha-D-galactosidase
A (EC 3.2.1.22)

Human DNA
sequence from clone
799F15 on
chromosome Xq25,
complete sequence
AL023774 | [Homo sapiens]



1137 | X64639



H. sapiens DNA
repetitive
subtelomeric-like
sequence (522 bp)



1138 | U97058



Human HuD gene
5'UTR



1e-08



<NONE>



1e-08



3334427



1e-08



1354935 



<NONE>

HYPOTHETICAL PROTEIN

MJ1207 Methanococcus
jannaschii >gi|1591S37
(U67562) protease synthase and
sporulation negative regulator
Pail, putative [Methanococcus
jannaschii]



(U58330) probable copper-
transporting atpase



<NONE> 



<NONE>



1e-08



77356 



hypothetical 70K protein
eggplant mosaic virus



5e-09



3387886



(AF070530) unknown [Homo
sapiens]



9.1 



1.2 



0.098 



9.5 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



Human DNA



P VALUE



1139 | Z82181



1140 | AJ006587



1141 | Y11108



1142 | AE001223



1143 | Z47046



1144 | AG000746



sequence from 
cosmid E86D10 on 
chromosome 22. 
contains ESTs, 
exontrap, complete 
sequence 



Mus musculus mRNA 
for translation 
initiation factor eIF2
gamma X 



H.sapiens WNT8B 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



5e-09



5e-09 



Treponema pallidum 
section 39 of 87 of 
the complete genome | 4e-09 



4e-09 



Human cosmid 
QLL2C9 from 



Xq28 



Homo sapiens 
genomic DNA, 21q 
region, clone: 
Tl7IBm40 



4e-09 



4e-09 



DESCRIPTION 



728831 



!!!! ALU SUBFAMILY J 
WARNING ENTRY 



1872200 



2854198 



3334189 



(U22376) alternatively spliced 
product using exon 13A 



(AF045646) contains similarity 
to collagens 



CELL DIVISION PROTEIN 
FTSY HOMOLOG 



104045 



1145 | M74002



1146 | U95094



Human arginine-rich
nuclear protein
mRNA, complete cds.



Xenopus laevis XL- 
INCENP (XL- 
INCENP) mRNA, 
complete cds 



4e-09 



2e-09 



113666 



fibroblast growth factor receptor 
A1 precursor - African clawed
frog >gi|214894 (M55163)
fibroblast growth factor receptor 
[Xenopus laevis] 



P VALUE



!!!! ALU CLASS A WARNING
ENTRY !!!! 



3875371 



U-juj^u; luiiljiiu, a i jiiiil anu 
arginine rich domain, possesses 
weak similarity with the RNA 
binding domains from RNA 
splicing factor U2AF 65 KD 
subunit; cDNA EST 
EMBL:D64658 comes from this
gene; cDNA EST
EMBL:D66829 comes f...
>gi|3878699|gnl|PID|e1351700
possesses weak similarity with
the RNA binding domains from
RNA splicing factor U2AF 65
KD subunit; cDNA EST
EMBL:D64658 comes from this
gene; cDNA EST
EMBL:D66829 comes f...



2494337 



ENDO-1,4-BETA-XYLANASE
PRECURSOR sp.1 



8.4 



0.64 



4.0 



1.5 



1.3 



0.33 



3e-06 



4.9 






Nearest Neighbor (BlastN vs. Genbank) 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



SEQ 
ID 



ACCESSION DESCRIPTION 



P VALUE 



ACCESSION 



DESCRIPTION 



P VALUE 



1147 



Drosophila
melanogaster UDP-
glucose:glycoprotein
glucosyltransferase
U20554 | mRNA, complete cds



2e-09 



2499087 



UDP-GLUCOSE:GLYCOPROTEIN
GLUCOSYLTRANSFERASE
PRECURSOR (DUGT)
glucosyltransferase - fruit fly
(Drosophila sp.)
glucosyltransferase precursor
[Drosophila melanogaster]



4e-24 



1148 



H.sapiens CpG DNA,
clone 91c9, forward
Z56162 | read cpg91c9.ft1a.



1e-09



1149



Mus musculus
Pontin52 mRNA,
AF100694 | complete cds



<NONE> 



<NONE> 



<NONE> 



1e-09



1002424



(U25739) YSPL-1 form 1 [Mus
musculus]



8.9 



1150 



Homo sapiens NKG5
M85276 | gene, complete cds



1e-09



1151



M94065



Human
dihydroorotate
dehydrogenase
mRNA, 3' end.



2315436 



(AF016447) No definition line 
found [Caenorhabditis elegans]



8.3



1e-09



1152 



1153 



Homo sapiens 
genomic CAG repeat 
element, clone 
AJ131895 | 60o2(25 0)

Human DNA

sequence from
cosmid E86D10 on
chromosome 22.
contains ESTs,
exontrap, complete
Z82181 | sequence



3892656 



(AB014464) MGC-24V [Mus 
musculus] 



6.2 



5e-10 



<NONE> 



<NONE> 



<NONE> 



5e-10 



728831 



!! ALU SUBFAMILY J 
WARNING ENTRY 



7.9 



1154 



1155 



Homo sapiens mRNA 
for putative 
AJ224442 | methyltransferase



5e-10 



113667 



! ALU CLASS B WARNING 
ENTRY!!!! 0.15 



Homo sapiens RET 
finger protein-like 1 
antisense transcript, 
AJ010230 | partial



5e-10



728834



!!!! ALU SUBFAMILY SB2
WARNING ENTRY 



0.006 



1156 



Homo sapiens 
silencer of death 
domains (SODD) 
AF111116 | mRNA, complete cds



5e-10



1157



Z97017



Homo sapiens mRNA
for hypothetical
protein



4160014



(AF111116) silencer of death
domains [Homo sapiens] 



2e-08 



4e-10 



<NONE> 



<NONE> 



<NONE> 






Nearest Neighbor (BlastN vs. Genbank)
SEQ

ID | ACCESSION



DESCRIPTION



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



P VALUE | ACCESSION



DESCRIPTION 



1158 | AF001298



1159 | Y11395



1160 | U41096



Homo sapiens type II



integral membrane 
protein 



4e-10 



H.sapiens mRNA for 
p40 



<NONE> 



2e-10 



Human non-coding 
sequence upstream 
from DOC-2 gene on 
chromosome 5 



1000340 



2e-10 



728837 



Sambucus nigra 
ribosome inactivating
protein precursor
1161 | AF012899 | mRNA, complete cds
S.cerevisiae
chromosome II
reading frame ORF
1162 | Z36111 | YBR242w



6e-11



<NONE>



6e-11



2213560 



1163 



Schizosaccharomyces
pombe mRNA, partial
D89174 | cds, clone: SY 100 4 | 6e-11
Human DNA 



3879758 



<NONE> 



(U34384) CheW [Borrelia 
burgdorferi] 



!!!! ALU SUBFAMILY SQ 
WARNING ENTRY 



<NONE> 



<NONE> 



2.4 



0.28 



<NONE> 



?97052) hypothetical protein 



1164 



Z95437 



sequence from
cosmid A1 on
chromosome 6
contains ESTs.
HERV like retroviral
sequence | 5e-11



<NONE> 



Sambucus nigra 
ribosome inactivating 
protein precursor 
1165 | AF012899 | mRNA, complete cds



1166 | X56997



Human UbA52 gene
coding for ubiquitin-
52 amino acid fusion
protein



5e-11



3886065 



Homo sapiens full 
length insert cDNA 
1167 | AF086253 | clone ZD40G12



2e-11



<NONE>



2e-11



21347S0 



AHvtzu) Similarity to yeast 
protein TREMBL ID E246895); 
cDNA EST EMBL:T00018 
comes from this gene; cDNA 
EST EMBL:C13908 comes 
from this gene; cDNA EST 
EMBL:C11656 comes from this
gene; cDNA EST yk234a5.3 
comes from this ge... 



<NONE> 



4e-30 



<NONE> 



(AF106581) contains similarity 
to C4-type zinc fingers 



<NONE> 



apoptosis inhibitor IAP homolog 
human 



4.9 



<NONE> 



3.S 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1168 


AB018314 


Homo sapiens mRNA
for KIAA0771
protein, partial cds


2e-11


3024343


P53-BINDING PROTEIN
53BP2 Bbp/53BP2 [Homo
sapiens]


2e-11


1169 


Z74972 


S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR064c 


2e-ll 


3041855 


(AC004537) similar to tumor 
suppressor p33ING1; similar to
AF044076 (PID:g2829208)
[Homo sapiens]


2e-40 


1170 


Z82181 


Human DNA 
sequence from 
cosmid E86D10 on
chromosome 22. 
contains ESTs, 
exontrap, complete 
sequence 


7e-12 


<NONE> 


<NONE> 


<NONE> 


1171 


X77738 


H.sapiens red cell 
anion exchanger 
(EPB3, AE1, Band 3)
gene, 3' region


7e-12 


2135416 


hypothetical protein - human 
>gi|288145 


0.012 


1172 


S61977 


medium-chain acyl- 
CoA dehydrogenase 
{exon 10, intron 10) 
[human, Genomic, 
1407 nt] 


6e-12 


113666 


!!!! ALU CLASS A WARNING
ENTRY !!!!


0.100


1173 


X662RS 


M.musculus DNA for 
HC1 locus


6e-12


854065


(X83413) U88 [Human
herpesvirus 6]


2e-06 


1174 


S78744 


protein S=activated
protein C cofactor
[rats, liver, mRNA,
3315 nt]


6e-12


2338292


(AF009243) proline-rich Gla
protein 2 [Homo sapiens]


3e-10 


1175 


X58474 


Bovine OXT gene for 
noncoding region


2e-12


1296429


(L77967) small proline-rich
protein with paired repeat


4.1 


1176 


Z56314


H.sapiens CpG DNA,
clone 10h10, reverse
read cpg10h10.rt1a.


2e-12


2935221


(AF030154) pVII [bovine
adenovirus type 3]


2.8 


1177 


Z56314


H.sapiens CpG DNA,
clone 10h10, reverse
read cpg10h10.rt1a.


2e-12


2708659


(AF037440) putative 26 kDa
protein [Edwardsiella ictaluri]


2.8 


1178


Z19543


M.musculus h2-
calponin cDNA


2e-12


2497945


BETA SCRUIN >gi|1015535
(Z47541) beta scruin [Limulus
polyphemus]


2e-0-i 






Nearest Neighbor (BlastN vs. Genbank)



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



SEQ
ID



ACCESSION | DESCRIPTION | P VALUE | ACCESSION



1179 | S45332



1180 | AF012899



erythropoietin 



DESCRIPTION 



receptor [human,
placental, Genomic,
8647 nt]



7e-13



!!!! ALU SUBFAMILY SC
728835 | WARNING ENTRY



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds | 2e-13



<NONE> 



<NONE> 



P VALUE



0.074



1181 | AF012899



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds | 2e-13



<NONE> 



<NONE> 



1182 | Z59S09



H.sapiens CpG DNA,
clone 15a1, reverse
read cpg15a1.rt1a.



1183 | D10170



Human CYP11B2
gene for steroid 18-
hydroxylase



(AL023634) hypothetical
2e-13 | J3L50251 | protein



2e-13 



1184 1 U65416 



Human MHC class I 
molecule (MICB) 
gene, complete cds 



!!!! ALU SUBFAMILY SQ 
728837 | WARNING ENTRY



2e-13



126295



line-1 reverse
transcriptase 
(homolog 



<NONE> 



0.66 



3e-05 



6e-11



1185 | AJ006031



1186 | U34976



1187 | D30647



1188 | Z63247



1189 | U27196



1190 | M26219



Mus musculus 
IHABP gene, 
promoter 



8e-14 



Human gamma- 
sarcoglycan mRNA, 
complete cds 



2132223 



(hypothetical protein YPL186c 
yeast 



8e-14 



1054903 



(U34976) gamma-sarcoglycan 
[Homo sapiens] >gi|4239660
sapiens]

ACYL-COA



Rat mRNA for very-
long-chain Acyl-CoA
dehydrogenase, 
complete cds 



8e-14 



3183512 



DEHYDROGENASE, VERY-
LONG-CHAIN SPECIFIC
(VLCAD) >gi|2388724
(AF017176) very-long-chain
acyl-CoA dehydrogenase [Mus
musculus]



H.sapiens CpG DNA,
clone 7g4, forward
read cpg7g4.ft1a.



6e-14 



86285 



histone H1.01 - chicken



Gallus gallus zinc
finger protein (Fzf-1)
mRNA, complete cds. | 3e-14



African green
monkey origin of
replication



zinc finger protein - chicken
2134436 | (fragment)



2e-14 



<NONE> 



1.1 



0.034 



8e-23 



6.8 



4e-10 



<NONE> 



<NONE> 






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



DESCRIPTION 



1191 | AF100694



1192 | AF012899



P VALUE 



Mus musculus 



Pontin52 mRNA, 
complete cds 



Sambucus nigra
ribosome inactivating
protein precursor
mRNA, complete cds



Nearest Neighbor (BlastX vs. Non-Redundant Proteins) 



ACCESSION 



2e-14



4235641 



2e-14 



3043728 



DESCRIPTION 



(AF119040) NL0D
[Lycopersicon esculentum]



(AB011174) KIAA0602 protein
[Homo sapiens]



P VALUE 



0.65 



1193 | AJ005866



Homo sapiens mRNA 
for putative Sqv-7- 
like protein, partial 



2e-14 



4008517 



1194 | U32709



Haemophilus 
influenzae Rd section 
24 of 163 of the 
complete genome 



1195 | AF073485



Homo sapiens MHC
class I-related protein
MR1 precursor
(MR1) gene, partial
cds



2e-14



3861056 



8e-15 



728831 



(AJ005866) Sqv-7-like protein
[Homo sapiens]



(AJ235272)
POLYRIBONUCLEOTIDE
NUCLEOTIDYLTRANSFERA
SE (pnp) [Rickettsia
prowazekii]



0.004 



!!!! ALU SUBFAMILY J 
WARNING ENTRY 



6e-28 



1196 | AF052135



Homo sapiens clone
23625 mRNA
sequence



1197 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



8e-15 



4098124 



(U73522) AMSH [Homo 
sapiens] 



8e-14 



3e-15 



<NONE> 



<NONE> 



<NONE> 



1198 | AF012899



Sambucus nigra
ribosome inactivating 
protein precursor 
mRNA, complete cds 



3e-15 



113671 



!!!! ALU CLASS F WARNING 
ENTRY !!!! 



1199 | Z75104



S.cerevisiae 
chromosome XV 
reading frame ORF 
YOR196c 



3e-15 



3878570 



(Z46381) similar to lipoic acid 
synthase; cDNA EST yk283b6.3 
comes from this gene; cDNA 
EST yk283b6.5 comes from this 
gene; cDNA EST yk472f5.3
comes from this gene; cDNA
EST yk472f5.5 comes from this
gene; cDNA EST yk476e7.3...



1e-15






SEQ 
ID 



Nearest Neighbor (BlastN vs. Genbank) 



ACCESSION 



1200 | X70052



DESCRIPTION 



P VALUE 



S.cerevisiae sofl 
gene 



1201 | AF012899 



1202| M92295 



Sambucus nigra 
ribosome inactivating
protein precursor
mRNA, complete cds



Gorilla gorilla gamma-
1 and gamma-2
globin genes, 
complete cds. 



1203 | L34587



1204 | D83649



1205 | AC005190



1206 | J03626



Homo sapiens RNA 
polymerase II 
elongation factor SIII 
p15 subunit mRNA,
complete cds. > ::
gb|AR022286|AR022
286 Sequence 7 from 
patent US 5792634 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



(Wim) coded for by C.



3e-15



1125754



elegans cDNA cmI6f6; coded
for by C. elegans cDNA
CEESU63F; similar to S.
cerevisiae SOF1 protein
(SP:P33750) [Caenorhabditis
elegans]



2e-15 



<NONE> 



<NONE> 



1e-15



hypothetical protein 2 - human
284078 | >gi|182220



Xenopus laevis
mRNA for xSox7
protein, complete cds



Homo sapiens PAC 
clone DJ1152D16 
from Xq23; complete 
sequence [Homo 
sapiens] 



Human UMP 
synthase mRNA,
complete cds. 



9e-16 



<NONE> 



<NONE> 



8e-16



(D83649) xSox7 protein
2447043 | [Xenopus laevis]



3e-16 



<NONE> 



<NONE> 



!!!! ALU CLASS B WARNING
3e-16 | 113667 | ENTRY !!!!



P VALUE 



3e-29 



<NONE> 



7.4 



<NONE> 



4e-06 



<NONE> 



1207| J00083 



1208



U70674 



Human Alu family 
interspersed repeat; 
clone BLUR11. 



3e-16 



728836 



Mus musculus m- 
Numb (m-nb) mRNA,
complete cds 



le-16 



<NONE> 



!!!! ALU SUBFAMILY SP 
WARNING ENTRY



4e-06 



<NONE> 



<NONE> 








Nearest Neighbor (BlastN vs. Genbank)


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)


SEQ
ID


ACCESSION


DESCRIPTION


P VALUE 


ACCESSION 


DESCRIPTION 


P VALUE 
















1209 


U66619 


Human SWI/SNF 
complex 60 KDa 
subunit (BAF60c)
mRNA, complete cds


1e-16


1549247


(U66619) SWI/SNF complex 60
KDa subunit [Homo sapiens]


0.003 


1210 


U75467 


Drosophila 
melanogaster Rga and

cds


1e-16


1658503


(U75467) Atu [Drosophila
melanogaster] 


5e-32 


1211 


M72709


Human alternative 
splicing factor 
mRNA, complete cds. 


3e-17 


<NONE> 


<NONE> 


<NONE> 


1212 


U26556 


Human ferritin H

1(11 1 It 1 J Hill 11 

(FTHL13) 
pseudogene. 


3e-17 


<NONE> 


<NONE> 


<NONE> 


1213 


D32064 


Human gene for 2-
oxoglutarate
dehydrogenase,
complete cds


3e-17


2088843


(AF003386) F59E12.9 gene
product [Caenorhabditis
elegans]


0.12 


1214 


M76364 


Human (Papua New 
Guinean) 

Mitochondrial DNA 
control region, 

OCIJUCIHC I J I. 


3e-17


114009


APAG PROTEIN
>gi|72927|pir||BVECAG apaG
protein - Escherichia coli
>gi|40918 (X04711) URF
hypothetical protein
[Escherichia coli]


0.006 


1215 


AF017466


Homo sapiens
genomic sequence
from subtelomeric
region of
chromosome 4q


1e-17


3947985


(U78948) MADS-box protein 2
[Malus domestica]


4.1 


1216 


AF004876 


Homo sapiens 
54TMp (54tm) 
mRNA, complete cds 


1e-17


4101574 


(AF004876) 54TMp [Homo 
sapiens] 


0.006 


1217 


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


9e-18 


<NONE> 


<NONE> 


<NONE> 


1218 


AF086758


Rattus norvegicus Na-
K-2Cl cotransporter


4e-18


3892703


(AL033545) putative glycine-
rich protein [Arabidopsis
thaliana]


0.30 


1219 


AF020089


Homo sapiens
5 EN11B mRNA,
complete cds


4e-18


2642493


(AF023910) DNA
topoisomerase I [Physarum
polycephalum]


0.0S3 


1220 


X82333


H.sapiens IRLB gene
(exon 1-3)


4e-18


106837


IrlB protein - human (fragment)
>gi|33969


2e-11






SEQ
ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank)



DESCRIPTION



1221 | AB002383



1222 | X98485



KIAA0385 gene, 
complete cds 



Human mRNA for 



P VALUE 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
DESCRIPTION 



ACCESSION 



P.vivax PV14 eene 
H.sapiens flow-sorted 



chromosome 6 
HindIII fragment
SC6pA21E8



Homo sapiens (clone
JH4Bl) PM-Scl
autoantigen mRNA,
1224 | L01457 | complete cds.

Dog nonerythroid



4e-18



1e-18



3228540 



<NONE> 



1225 



beta-spectrin mRNA
L02897 | 3' end.



Homo sapiens mRNA
for APCL protein,
1226 | AB012162 | complete cds



Homo sapiens mRNA
for KIAA0521
1227 | AB011093 | protein, partial cds



1228 | X78454



X.laevis AB21
mRNA for RPD3
homologue



1229 



1230 



1231 



1232 



U88895 



Human endogenous
retrovirus H D1
leader

region/integrase-
derived ORF1,
ORF2, and putative
envelope protein
mRNA, complete cds



1e-18



1e-18



2981631 



346287 



4e-19 



3493358 



4e-19 



3894265 



4e-19 



3043566 



4e-19 



3023945 



P VALUE



(AF060181) zinc finger protein
[Homo sapiens] | 6e-25



<NONE>



<NONE>



0.001 



(AB012223) ORF2 [Canis
familiaris]

nucleolar 100K polymyositis-
scleroderma protein - human
>gi|35555 (X66113) PM/Scl
100kD nucleolar protein [Homo

sapiens] | 0.001

(AB017037) nonstructural



protein precursor [Himetobi P
virus]



0.12



(AB012162) APCL protein
[Homo sapiens]



0.002



(AB011093) KIAA0521 protein
[Homo sapiens]



9e-09 



HISTONE DEACETYLASE 
(HP) thaliana]



U34377



Human tyrosine
kinase TXK (txk)
gene, exon 13.



X72966



M.musculus rab3A
gene 



2e-19 



59977 



1e-19



728831



AB007953



Homo sapiens
mRNA, chromosome
1 specific transcript
KIAA0484



1e-19



2408076 



5e-34 



(Z14310) tripartite fusion 
transcript PLA2L [Human 
endogenous retrovirus]



1e-04



!!! ALU SUBFAMILY J 
WARNING ENTRY 



4e-20 



<NONE> 



(Z99167) putative peroxisomal 
organisation and biogenesis 
protein [Schizosaccharomyces
pombe]



3e-05 



2e-09 



<NONE> 



<NONE> 






SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


1233


D14034


Human gene for Zn-
alpha2-glycoprotein,
complete cds


2e-20 | 3928756


(ABU)li33) similar to
C. elegans hypothetical protein
CET01H8.1, CEC05C12.3, CEF5
4D1.5, similar to trp and trp-like
proteins [Homo sapiens]

DNA-binding protein - mouse
>gi|437444


1e-07

1e-19


1234 


X82126


H.sapiens HOK-2
gene, exon 2


2e-20 | 2IT77M


1235 


AF093684


Luciferase reporter
vector pXP2 *SA,
complete sequence


(AF041382) microtubule
2773363 | binding protein n,n m ian


5.5 1 


1236 


J05272 


Human IMP 
dehydrogenase type 1 
mRNA complete cds. 


5e-21 | 124417


INOSINE-5'-
MONOPHOSPHATE
DEHYDROGENASE 1 (IMP
DEHYDROGENASE 1)
(IMPDH-I) (IMPD 1) - human




1237


D86997


Human (lambda)
DNA for
immunoglobulin light
chain


5e-21 | 3878961


(Z75712) Similarity to S. Pombe
BEM1/BUD5 suppressor;
cDNA EST EMBL:Z14470
comes from this gene; cDNA
EST yk482d4.3 comes from this
gene; cDNA EST yk482d4.5
comes from this gene
[Caenorhabditis elegans]


2e-04
6e-46


1238


Z79865


H.sapiens
chromosome 22 CpG
island DNA genomic
MseI fragment, clone
3C)2f3, forward read
3C)2f3.f


(AF024614) ADAM 10
[Caenorhabditis elegans] Zinc-
binding metalloprotease domain;
cDNA EST CEMSA42F comes
from this gene; cDNA EST
yk2ISf3.3 comes from this gene;
cDNA EST yk443d9.3 comes
from this gene; cDNA EST
yk443d9.5 comes from this
2e-21 | 2739037 | gene; cDNA


2.6








Nearest Neighbor (BlastN vs. Genbank)
SEQ
ID
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


1248


Z69654


Human DNA
sequence from
cosmid L98A6,
Huntington's Disease
Region, chromosome
4p16.3.


3e-24 


4240566 


(AF123462) neurexin III [Homo
sapiens]


4.5


1249 


AB007914


Homo sapiens mRNA
for KIAA0445
protein, complete cds


2e-24


3885949


(AF095568) amelogenin
[Paleosuchus palpebrosus]


3.2 


1250


AF088072


Homo sapiens full
length insert cDNA
clone ZD93D10


2e-24


323091


immunodominant microneme
protein Etp100 - Eimeria tenella
>gi|2707733 (AF032905)
microneme protein precursor
[Eimeria tenella]


0.34 


1251 


AF069489 


Homo sapiens cAMP-
specific
phosphodiesterase 4A
variant pde46
(PDE4A) gene, exons
2 through 13 and
alternative splice
exons 3a, 6a, 6b, and
9a


2e-24


728836


!!!! ALU SUBFAMILY SP
WARNING ENTRY


1e-05


1252


Y12853


Homo sapiens P2X7
gene, exon 4-8


9e-25


728831


!!!! ALU SUBFAMILY J
WARNING ENTRY


1e-05


1253




Human 28S
ribosomal RNA gene,
complete cds.


8e-25


<NONE>


<NONE>


<NONE>


1254


AB007953


Homo sapiens
mRNA, chromosome 1
specific transcript
KIAA0484


8e-25


<NONE> 


<NONE> 


<NONE> 


1255


Z60212


H.sapiens CpG DNA,
clone 195c8, forward
read cpg195c8.ft1a.


8e-25


158154


(M81959) POU domain protein
[Drosophila melanogaster]


3.3 


1256


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


7e-25


<NONE>


<NONE>


<NONE>


1257


AF100694


Mus musculus
Pontin52 mRNA,
complete cds


7e-25


<NONE>


<NONE>


<NONE>


1258


Y12851


Homo sapiens P2X7
gene, exon 1 and
joined CDS


2e-25


<NONE>


<NONE>


<NONE>



SEQ
ID


Nearest Neighbor (BlastN vs. Genbank)
ACCESSION | DESCRIPTION | P VALUE


Nearest Neighbor (BlastX vs. Non-Redundant Proteins)
ACCESSION | DESCRIPTION | P VALUE


1259 | U64033


Mus musculus Tera
(Tera) mRNA,
complete cds


9e-26 


<NONE>


<NONE> 


<NONE> 


1260 | U19181


Rattus norvegicus
Rabin3 mRNA,
complete cds.


9e-26


624225


(U19181) Rabin3 [Rattus
norvegicus]




1261 | AF020788


Caenorhabditis
elegans SEL-10 (sel-
10) mRNA, complete
cds


9e-26


3915881


SLL- IU PKU 1 bIN LafioTaa 

CDC4 gene (TR:E234056); 
cDNA EST EMBL:D27699
comes from this gene; cDNA
EST EMBL:D27698 comes
from this gene; cDNA EST
EMBL:D32793 comes from this
gene; cDNA EST
EMBL:D33271 comes from this
gen...


1e-13

7e-32 


1262 | AB016930


Cricetulus griseus
mRNA for
phosphatidylglycerophosphate
synthase,
complete cds


8e-26


4159682


(AB016930)

Phosphatidylglycerophosphate
synthase [Cricetulus griseus]


0.045 


1263 | AF100694


Mus musculus
Pontin52 mRNA,
complete cds


3e-26

3878629 


(Z93385) predicted using 
Genefinder; Similarity to 
B.subtilis GTP-binding protein 


2e-10 




12641 X91195 '' 


H.sapiens SOM172 
mRNA 


le-26 


<NONE> 


<NONE> 


<NONE> 




Ljloj I A-T lUUoy4 1 


Mus musculus 
Pontin52 mRNA, 
complete cds 


le-26 1 


1360637 \( 


X959951 ENBPi rviri.-. cnt;«,oi 


3. 1 




12661 L08237 


Human MG2 1 
mRNA. partial cds. 


le-26 1 


(L08237) located at OATL1 
950411 [Homo sapiensl 


9e-09 1 


1 


1 h 

1 P 

267 1 AF 100694 c 


4us musculus 
ontin52 mRNA, 
omplete cds 


9e-27 1 


(AL032657) similar to EGF-like 
domain: cDNA EST yk299al2.3 
comes from this gene; cDNA 
EST EMBL:D35398 comes 
from this gene; cDNA EST 
Iyk331h6.5 comes from this 
gene; cDNA EST yk299al2.5 
(comes from this gene; cDNA 
3881080 ESTvk467eS . 


0.001 




1 N 

1 P 

268 1 AFI00694 c< 


lus musculus 
ontin52 mRNA, 
smplete cds 


8e -27 1 


H 

1731324 |> u 


rPOTHETICAL PROTEIN 
i| 166306 


4.0 






PCT/US00/18374 



SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1269 | X89211 | H.sapiens DNA for endogenous retroviral like element | 8e-27 | 2065209 | (Y12713) Gag polyprotein [Mus musculus] | 0.005
1270 | U73166 | Homo sapiens cosmid clone LUCA15 from 3p21.3, complete sequence [Homo sapiens] | 3e-27 | 728831 | !!!! ALU SUBFAMILY J WARNING ENTRY | 4e-04
1271 | D78255 | PAP-1, complete cds | 3e-27 | 1850098 | (D78255) PAP-1 [Mus musculus] | 2e-10
1272 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 1e-27 | 2133579 | spermatophorin Sp23 - yellow mealworm molitor] | 0.39
1273 | AB015202 | Homo sapiens gene for hippocalcin, exon 2, 3 and complete cds | 1e-27 | 3877698 | (Z83318) predicted using Genefinder; cDNA EST yk369e7.5 comes from this gene [Caenorhabditis elegans] | 0.37
1274 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 1e-27 | 3328188 | (AF074902) laminin alpha chain [Caenorhabditis elegans] | 0.19
1275 | Z29336 | H.sapiens gene for Cu/Zn-superoxide dismutase | 1e-27 | 728831 | !!!! ALU SUBFAMILY J WARNING ENTRY | 6e-05
1276 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 9e-28 | 2133579 | spermatophorin Sp23 - yellow mealworm molitor] | 9.2
1277 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 9e-28 | 2133579 | spermatophorin Sp23 - yellow mealworm molitor] | 0.054
1278 | AB001636 | Homo sapiens mRNA for ATP-dependent RNA helicase #46, complete cds | 4e-28 | 3913425 | PUTATIVE PRE-MRNA SPLICING FACTOR ATP-DEPENDENT RNA HELICASE >gi|2275203 (AC002337) RNA helicase isolog [Arabidopsis thaliana] | 3e-22
1279 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 3e-28 | 4056454 | (AC005990) Contains repeated region with similarity to gb|U43627 extensin (atExt1) gene from Arabidopsis thaliana. ESTs gb|Z34165 and gb|Z18788 come from this gene. [Arabidopsis thaliana] | 0.066



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank) | Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



ACCESSION 



DESCRIPTION 



1280 | AF100694



1281 | AF100694



Mus musculus
Pontin52 mRNA,
complete cds
Mus musculus
Pontin52 mRNA,
complete cds



1282 | AF100694



1283 | AF100694



Mus musculus
Pontin52 mRNA,
complete cds



1284 | AF100694



1285 | AF100694



Mus musculus
Pontin52 mRNA,
complete cds



P VALUE | ACCESSION



DESCRIPTION 



(AC005990) Contains repeated



3e-28



1e-28



4056454
<NONE>



region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



<NONE> 



1e-28



<NONE> 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Mus musculus 
Pontin52 mRNA 
complete cds 



1286 | AF100694



1287 | AF100694



Mus musculus 
Pontin52 mRNA 
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds 



1288 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



<NONE> 



1e-28



<NONE>



<NONE>



1e-28



<NONE> 



<NONE> 



P VALUE



4e-05



<NONE>



<NONE>



<NONE>



1e-28



<NONE>



<NONE>



1e-28



<NONE>



<NONE>



1e-28



140505
PROBABLE INTRON
MATURASE liverwort
(Marchantia polymorpha)
chloroplast >gi|11663



1e-28



140505 



PROBABLE INTRON
MATURASE liverwort
(Marchantia polymorpha)
chloroplast >gi|11663



<NONE>



<NONE> 



<NONE> 



3.0 



1289 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



2133579 



spermatophorin Sp23 - yellow
mealworm molitor]



1290 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



4056454 



1291 | Z63029



H.sapiens CpG DNA,
clone 77b3, forward
read cpg77b3.ft1a.



1e-28



2493240 



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



HYPOTHETICAL 29.3 KD
PROTEIN pseudotsugata
nuclear polyhedrosis virus]



0.50 



0.057



0.014 



SEQ
ID



Nearest Neighbor (BlastN vs. Genbank)



ACCESSION DESCRIPTION 



1292 | AF100694



1293 | AF100694



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Mus musculus 
Pontin52 mRNA, 
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds 



1294 | AF100694



1295 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1296 | AF100694



1297 | AF100694



1298 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



P VALUE | ACCESSION



1e-28



118588



1e-28



4056454



1e-28



4056454 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



126363



1e-28



4056454 



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



4056454 



DESCRIPTION 



DEHYDRIN DHN3



P VALUE



>gi|lUUU35|pir||M8139 dehydrin
DHN3 - garden pea >gi|20709
(X63063) pea dehydrin DHN3

[Pisum sativum]

(AC005990) Contains repeated



region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]
(AC005990) Contains repeated



region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



0.010 



0.007 



LAMININ ALPHA- 1 CHAIN 
PRECURSOR precursor - 
human 



1e-28



3157926 



1299 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1e-28



4056454 



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]
(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



(AC002131) Strong similarity to
extensin-like protein gb|Z34465
from Zea mays. [Arabidopsis
thaliana]

(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



0.002 



3e-04 



1e-04



3e-05 



2e-05 






1e-05
SEQ

ID | ACCESSION



Nearest Neighbor (BlastN vs. Genbank) 



1300 | AF100694



1301 | AF100694



DESCRIPTION 



Nearest Neighbor (BlastX vs. Non-Redundant Proteins)



Mus musculus 
Pontin52 mRNA, 
complete cds 



P VALUE | ACCESSION



1e-28



Mus musculus 
Pontin52 mRNA, 
complete cds 



1302 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



1303 | AF100694



1304 | AF100694



1305 | AF100694



1306 | AF100694



Mus musculus 
Pontin52 mRNA, 
complete cds 



DESCRIPTION 



kinetoplast-associated protein -



P VALUE



320919 



1e-28



4056454



1e-28



4056454



1e-28



4056454 



Trypanosoma cruzi >gi|162142
(M25364) kinetoplast-associated

protein 1e-07

(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.

[Arabidopsis thaliana] 9e-08

(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]
(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]

Contains repeated 



1e-09



9e-10 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds 



Mus musculus 
Pontin52 mRNA, 
complete cds



1e-28



4056454



1e-28



4056454



1e-28



4056454 



region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana]



4e-10



(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.

[Arabidopsis thaliana] 9e-11

(AC005990) Contains repeated
region with similarity to
gb|U43627 extensin (atExt1)
gene from Arabidopsis thaliana.
ESTs gb|Z34165 and gb|Z18788
come from this gene.
[Arabidopsis thaliana] 6e-11






SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1307 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 4e-29 | <NONE> | <NONE> | <NONE>
1308 | AF079529 | Homo sapiens cAMP specific phosphodiesterase 8E | 4e-29 | <NONE> | <NONE> | <NONE>
1309 | X93334 | H.sapiens mitochondrial DNA, complete genome | 4e-29 | 116977 | CYTOCHROME C OXIDASE POLYPEPTIDE I chain I - human mitochondrion (SGC1) >gi|13006 (V00662) cytochrome oxidase I [Homo sapiens] >gi|506829 (J01415) cytochrome oxidase subunit 1 [Homo sapiens] | 3e-09
1310 | AF020760 | Homo sapiens serine protease (Omi) mRNA, complete cds | 4e-29 | 2738915 | (AF020760) serine protease [Homo sapiens] | 8e-12
1311 | U95097 | Xenopus laevis mitotic phosphoprotein 43 mRNA, partial cds | 4e-29 | 2072294 | (U95097) mitotic phosphoprotein 43 [Xenopus laevis] | 1e-25
1312 | L32162 | Homo sapiens transcription factor mRNA, 5' end. | 2e-29 | 2501706 | RENAL TRANSCRIPTION FACTOR KID-1 finger protein [Mus musculus] | 8e-15
1313 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 1e-29 | 4056454 | (AC005990) Contains repeated region with similarity to gb|U43627 extensin (atExt1) gene from Arabidopsis thaliana. ESTs gb|Z34165 and gb|Z18788 come from this gene. [Arabidopsis thaliana] | 1e-04
1314 | AF100694 | Mus musculus Pontin52 mRNA, complete cds | 1e-29 | 1169643 | FMRFAMIDE-RELATED NEUROPEPTIDES PRECURSOR >gi|416208 (J03137) neuropeptide precursor FMRFamide-related peptide [Lymnaea stagnalis] | 1e-05
1315 | U50839 | Homo sapiens g16 protein (g16) mRNA, complete cds | 1e-29 | 3212101 | (AF069517) RNA binding protein DEF-3 [Homo sapiens] | 6e-10








SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1316 | X69711 | H. sapiens mRNA for ICAM-R | 5e-30 | 299356 | intercellular adhesion molecule 3, ICAM-3=lymphocyte function-associated antigen 1 counter-receptor homolog [human, tonsil, Peptide Partial, 518 aa] | 3e-08
1317 | AF010227 | Homo sapiens receptor-associated coactivator 3 | 5e-30 | 2331250 | (AF012108) Amplified in Breast Cancer [Homo sapiens] | 8e-09
1318 | AF086395 | Homo sapiens full length insert cDNA clone ZD75C01 | 2e-30 | 3861241 | (AJ235273) CELL SURFACE ANTIGEN (sca5) | 4.2
1319 | M27830 | Human 28S ribosomal RNA gene, complete cds. | 2e-30 | 1730522 | PHOSPHOGLYCERATE KINASE 2.7.2.3) - Pyrococcus woesei >gi|1054832 (X73527) phosphoglycerate kinase [Pyrococcus woesei] | 3.8
1320 | M79307 | Mouse GTP-binding protein (Rab17) mRNA sequence. | 2e-30 | 464564 | RAS-RELATED PROTEIN RAB-17 Rab17 - mouse (fragment) >gi|297157 (X70804) rab17 [Mus musculus] | 9e-11
1321 | AL022168 | Human DNA sequence from clone U247E12 on chromosome Xq22-23, complete sequence [Homo sapiens] | 1e-30 | 2072967 | (U93570) putative p150 [Homo sapiens] | 3e-11
1322 | X85124 | M.musculus pacsin gene | 1e-30 | 2217964 | (Z50798) p52 [Gallus gallus] | 1e-34
1323 | U37408 | Homo sapiens phosphoprotein CtBP mRNA, complete cds | 5e-31 | 74518 | structural polyprotein - Venezuelan equine encephalitis virus (strain TRD) >gi|323710 (J04332) poly-envelope protein [Venezuelan equine encephalitis virus] | 1.1
1324 | L04193 | Human lens membrane protein (mp19) gene, exon 11. | 2e-31 | 728831 | !!!! ALU SUBFAMILY J WARNING ENTRY | 7e-07
1325 | M11167 | Human 28S ribosomal RNA gene. | 6e-32 | <NONE> | <NONE> | <NONE>
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1326 | M33336 | Human cAMP-dependent protein kinase type I-alpha subunit (PRKAR1A) mRNA, complete cds | 2e-32 | <NONE> | <NONE> | <NONE>
1327 | J03060 | Human glucocerebrosidase pseudogene, complete cds | 2e-32 | 2144479 | glucosylceramidase (EC 3.2.1.45) precursor - human | 1e-05
1328 | U33053 | Human lipid-activated protein kinase PRK1 mRNA, complete cds | 7e-33 | 2137689 | protein kinase (EC 2.7.1.37) - mouse | 1e-14
1329 | J04617 | Human elongation factor EF-1-alpha gene, complete cds. > :: dbj|E02629|E02629 DNA of human polypeptide chain elongation factor-1 alpha | 6e-33 | <NONE> | <NONE> | <NONE>
1330 | L40396 | Homo sapiens (clone s22i71) mRNA fragment | 6e-33 | 124235 | INTERMEDIATE FILAMENT PROTEIN B protein B - common roundworm | 1.00
1331 | Z72813 | S.cerevisiae chromosome VII reading frame ORF YGR028w | 6e-33 | 1709135 | MSP1 PROTEIN HOMOLOG Yeast MSP1 protein (TAT-binding homolog 4) | 8e-50
1332 | AB007941 | Homo sapiens mRNA for KIAA0472 protein, partial cds | 2e-33 | 1150834 | (U42471) Wiscott-Aldrich Syndrome protein homolog [Mus musculus] | 2.0
1333 | AF044574 | Rattus norvegicus putative peroxisomal 2,4-dienoyl-CoA reductase (DCR-AKL) mRNA, complete cds | 2e-34 | 4105269 | (AF044574) putative peroxisomal 2,4-dienoyl-CoA reductase [Rattus norvegicus] | 6e-15
1334 | D14657 | Human mRNA for KIAA0101 gene, complete cds | 7e-35 | <NONE> | <NONE> | <NONE>
1335 | X69910 | H.sapiens p63 mRNA for transmembrane protein | 7e-35 | 2136323 | trithorax homolog HTX - human (fragment) homolog=MLL alternative splicing, clone 14p-18B) | 0.94
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1336 | AF053455 | Homo sapiens tetraspan TM4SF (TSPAN-5) gene, complete cds | 7e-35 | 3152703 | (AF065389) tetraspan NET-4 [Homo sapiens] | 1e-25
1337 | X58374 | D.melanogaster crn mRNA | 3e-35 | 117478 | CROOKED NECK PROTEIN | 6e-41
1338 | AF086492 | Homo sapiens full length insert cDNA clone ZD95D11 | 9e-36 | 2909809 | (AF031328) aminoglycoside 6'-N-acetyltransferase It | 1.9
1339 | Z96223 | H. sapiens telomeric DNA sequence, clone 12PTEL120, read 12PTEL00120.seq | 3e-36 | 2408068 | (Z99165) hypothetical protein | 0.61
1340 | Z37986 | H. sapiens mRNA for phenylalkylamine binding protein. | 1e-36 | 1362793 | emopamil-binding protein - human >gi|780263 | 5e-11
1341 | U57847 | Human ribosomal protein S27 mRNA, complete cds. > :: gb|AA316327|AA316327 EST188061 HCC cell line (matastasis to liver in mouse) II Homo sapiens cDNA 5' end similar to similar to metallopanstimulin 1 | 3e-37 | 1171014 | 40S RIBOSOMAL PROTEIN S27 growth factor-inducible zinc finger protein MPS-1 - human >gi|431319 (L19739) metallopanstimulin [Homo sapiens] >gi|1373421 (U57847) ribosomal protein S27 | 1.4
1342 | Y15054 | Rattus norvegicus mRNA for 70 kDa tumor specific antigen, partial | 3e-37 | 3123027 | 70 KD WD-REPEAT TUMOR-SPECIFIC ANTIGEN >gi|2505957|gnl|PID|e353992 (Y15054) 70 kD tumor-specific antigen [Rattus norvegicus] | 2e-15
1343 | AF084205 | Rattus norvegicus serine/threonine protein kinase TAO1 mRNA, complete cds | 3e-37 | 3452473 | (AF084205) serine/threonine protein kinase TAO1 [Rattus norvegicus] | 5e-4"
1344 | X78604 | R. norvegicus (Sprague Dawley) ARL5 mRNA for ARF-like protein 5 | 1e-37 | <NONE> | <NONE> | <NONE>
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1345 | AJ236644 | Homo sapiens chromosome 22 CpG island DNA, genomic MseI fragment, clone 22CGIB49A3, complete read | 1e-37 | 2239219 | (Z97210) hypothetical protein | 6e-05
1346 | U09367 | Human zinc finger protein ZNF136 | 4e-39 | 2137269 | DNA-binding protein - mouse >gi|437444 | 7e-23
1347 | Z69649 | Human DNA sequence from cosmid L69F7B, Huntington's Disease Region, chromosome 4p16.3 contains Huntington Disease (HD) gene. | 3e-39 | 3096918 | (AL023094) putative cyclase associated protein CAP [Arabidopsis thaliana] | 5.6
1348 | AF065389 | Homo sapiens tetraspan NET-4 mRNA, complete cds | 1e-39 | 3152703 | (AF065389) tetraspan NET-4 [Homo sapiens] | 6e-29
1349 | AF038172 | Homo sapiens clone 23923 mRNA sequence | 1e-40 | 1813464 | (U60883) CapC [Bacillus firmus] | 2.8
1350 | Z83095 | H.sapiens Fanconi anaemia group A gene, exons 39, 40, 41, 42 and 43 | 1e-40 | 2137870 | zinc finger protein - mouse (fragment) | 3e-23
1351 | AF057734 | Homo sapiens 17-beta-hydroxysteroid dehydrogenase IV (HSD17B4) gene, exon 16 | 1e-40 | 2842416 | (AL008730) dJ487J7.1.1 (putative protein dJ487J7.1 isoform 1) [Homo sapiens] | 6e-61
1352 | AF070567 | Homo sapiens clone 24544 beta-dystrobrevin mRNA, partial cds | 4e-41 | 3133087 | (Y15718) dystrobrevin B DTN-B2 [Homo sapiens] | 7e-13
1353 | AF006088 | Homo sapiens Arp2/3 protein complex subunit p16-Arc (ARC16) mRNA, complete cds | 2e-41 | 3121767 | ARP2/3 COMPLEX 16 KD SUBUNIT | 3e-36
1354 | X69942 | M.musculus mRNA of enhancer-trap-locus 1 | 6e-42 | 2291152 | (AF01641S) No definition line found [Caenorhabditis elegans] | 6.4
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1355 | X87838 | H.sapiens mRNA for beta-catenin | 5e-42 | 1373019 | (U28811) cysteine-rich fibroblast growth factor receptor | 8e-05
1356 | AB018268 | Homo sapiens mRNA for KIAA0725 protein, partial cds | 5e-42 | 3882171 | (AB018268) KIAA0725 protein [Homo sapiens] | 2e-33
1357 | M84424 | Human cathepsin E (CTSE) gene, exon 9 and complete cds. | 2e-42 | <NONE> | <NONE> | <NONE>
1358 | U80776 | Human EST clone NIB 1543 mariner transposon Hsmar1 orf gene, complete cds | 2e-42 | 2231380 | (U80776) orf; encodes putative chimeric protein with SET domain in N-terminus with similarity to several other human, Drosophila, nematode and yeast proteins [Homo sapiens] | 3e-11
1359 | U55184 | Human G protein Golf alpha gene, exon 12 and complete cds | 2e-42 | 3165531 | (AF067608) No definition line found [Caenorhabditis elegans] | 1e-16
1360 | AC005190 | Homo sapiens PAC clone DJ1152D16 from Xq23, complete sequence [Homo sapiens] | 6e-43 | 2978255 | (AB007407) myeloid zinc finger protein-2 [Mus musculus] | 2.3
1361 | AB018284 | Homo sapiens mRNA for KIAA0741 protein, complete cds | 5e-43 | <NONE> | <NONE> | <NONE>
1362 | AB011137 | Homo sapiens mRNA for KIAA0565 protein, complete cds | 5e-43 | 3043654 | (AB011137) KIAA0565 protein [Homo sapiens] | Ie-U/
1363 | M93651 | Human set gene, complete cds. | 2e-43 | <NONE> | <NONE> | <NONE>
1364 | Z47087 | H.sapiens mRNA for RNA polymerase II elongation factor-like protein. | 2e-43 | 1872514 | (U84404) E6-associated protein E6-AP/ubiquitin-protein ligase [Homo sapiens] >gi|2361031 (AF01670S) E6-AP ubiquitin-protein ligase [Homo sapiens] | 7.2
1365 | U27197 | Drosophila melanogaster pelota (pelo) mRNA, complete cds | 2e-43 | 1352736 | PELOTA PROTEIN >gi|973224 (U27197) pelota [Drosophila melanogaster] | 1e-46
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1366 | D80007 | Human mRNA for KIAA0185 gene, partial cds | 6e-44 | 2498864 | RRP5 PROTEIN HOMOLOG (KIAA0185) hypothetical protein [illegible] S.cerevisiae. [Homo sapiens] | 6e-09
1367 | AF005039 | Homo sapiens secretory carrier membrane protein (SCAMP3) mRNA, complete cds | 6e-44 | 2232243 | (AF005039) secretory carrier membrane protein [Homo sapiens] | 2e-09
1368 | X68101 | R.norvegicus trg mRNA | 2e-44 | 550420 | (X68101) trg gene product [Rattus norvegicus] | 1e-37
1369 | AF044206 | Homo sapiens cyclooxygenase (COX-2) gene, promoter and exon 1 | 2e-45 | 2072953 | (U93565) putative p150 [Homo sapiens] | 5e-06
1370 | L48708 | Homo sapiens faciogenital dysplasia (FGD1) gene, 5' end of intron 17 | 8e-46 | <NONE> | <NONE> | <NONE>
1371 | X15822 | Human COX VIIa-L mRNA for liver-specific cytochrome c oxidase (EC 1.9.3.1) | 3e-46 | 117121 | CYTOCHROME C OXIDASE POLYPEPTIDE VIIA-LIVER PRECURSOR >gi|2144370|pir||OSHU7L cytochrome-c oxidase (EC 1.9.3.1) chain VIIa precursor, hepatic - human >gi|30147 (X15822) precursor (AA -23 to 60) [Homo sapiens] | 5e-13
1372 | U47323 | Mus musculus stromal cell protein mRNA, complete cds | 3e-46 | 1493833 | (U47323) stromal cell protein [Mus musculus] | 1e-48
1373 | AF059524 | Homo sapiens reticulon gene family protein | 7e-47 | 1731169 | HYPOTHETICAL 113.1 KD PROTEIN T28D9.7 IN CHROMOSOME II >gi|861264 (U28738) coded for by C. elegans cDNA yk8h5.3; coded for by C. elegans cDNA yk8h5.5; similar to C. elegans deg-1 and metr-4 in exon 2 [Caenorhabditis elegans] | 7.5
1374 | AJ132583 | Homo sapiens mRNA for puromycin sensitive aminopeptidase, partial | 3e-47 | 1777519 | (U39123) T cell receptor beta chain [Homo sapiens] | 9.7
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1375 | M97856 | Homo sapiens histone binding protein mRNA, complete cds | 3e-47 | 2645327 | (U83821) NADH dehydrogenase subunit 3 [Oryzomys palustris] | 5.7
1376 | U53220 | Human retinoblastoma-related Rb2/p130 gene, 5' flanking region and partial cds | 3e-47 | 2499225 | CMP-SIALIC ACID TRANSPORTER CMP-sialic acid transporter [Cricetulus griseus] | 5.3
1377 | X87870 | H. sapiens mRNA for hepatocyte nuclear factor 4a | 1e-47 | 728832 | !!!! ALU SUBFAMILY SB WARNING ENTRY | 7.3
1378 | AF060195 | Mus musculus proteasome regulator PA28 beta subunit gene, complete cds | 3e-48 | 478681 | limb deformity protein - chicken | 0.25
1379 | AB018285 | Homo sapiens mRNA for KIAA0742 protein, partial cds | 1e-48 | 3122969 | TESTIS SPECIFIC PROTEIN A (ZINC FINGER PROTEIN TSGA) >gi|281040|pir||S28499 probable zinc finger protein - rat >gi|57504 (X59993) zinc finger protein | 1e-30
1380 | U35032 | Human endogenous retrovirus clone c5.11, HERV-H multiply spliced subgenomic leader, protease and integrase region mRNA, partial cds | 4e-49 | 88558 | retroviral proteinase-like protein - human | 6e-05
1381 | AB007956 | Homo sapiens mRNA, chromosome 1 specific transcript KIAA0437 | 1e-49 | <NONE> | <NONE> | <NONE>
1382 | D86987 | Homo sapiens mRNA for KIAA0214 protein, complete cds | 1e-49 | 2497944 | ALPHA SCRUIN >gi|633238 (Z38132) scruin [Limulus polyphemus] >gi|1093326|prfp 103269A scrulin [Limulus sp.] | 9.7
1383 | U25826 | Human transcription factor (SC1) gene, complete cds. | 4e-50 | <NONE> | <NONE> | <NONE>
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1384 | U46690 | Mus musculus ATP-dependent RNA helicase mRNA, partial cds. | 4e-50 | 1335873 | (U46690) ATP-dependent RNA helicase [Mus musculus] | 3e-24
1385 | AF072128 | Mus musculus claudin-2 mRNA, complete cds | 2e-50 | 3335184 | (AF072128) claudin-2 [Mus musculus] | 4e-24
1386 | AF093593 | Homo sapiens snRNA activating protein complex 19kDa subunit (SNAP19) mRNA, complete cds | 1e-50 | 3668416 | (AF093593) snRNA activating protein complex 19kDa subunit [Homo sapiens] | 0.003
1387 | U79745 | Homo sapiens monocarboxylate transporter homologue MCT6 mRNA, complete cds | 1e-50 | 1177607 | (X92485) pval [Plasmodium | 
1388 | L09647 | Rattus norvegicus hepatocyte nuclear factor 3a | 1e-50 | 404764 | (L10409) fork head related protein [Mus musculus] | 2e-21
1389 | X61506 | Mouse E46 mRNA for E46 protein | 4e-51 | 114909 | BRAIN PROTEIN E46 | 1e-20
1390 | M33387 | Human debrisoquine 4-hydroxylase (CYP2D8P) and | 1e-51 | 126296 | LINE-1 REVERSE TRANSCRIPTASE HOMOLOG protein [Nycticebus coucang] | 5e-15
1391 | AF019767 | Homo sapiens zinc finger protein (ZPR1) mRNA, complete cds | 4e-52 | 961507 | (D63788) anchor protein. LCM | 5.9
1392 | Z37986 | H. sapiens mRNA for phenylalkylamine binding protein. | 2e-52 | <NONE> | <NONE> | <NONE>
1393 | U65416 | Human MHC class I molecule (MICB) gene, complete cds | 2e-52 | 3878637 | (Z49128) weak similarity with SINR protein (Swiss Prot accession number P06533); cDNA EST EMBL:T00631 comes from this gene; cDNA EST yk293d10.5 comes from this gene [Caenorhabditis elegans] | 8.7
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1394 | Z57547 | H.sapiens CpG DNA, clone 189a6, forward read cpg189a6.ft1a. | 2e-52 | 111187 | beta-globin DNA-binding protein B1, transcription factor PU.1 - mouse >gi|200586 (M32370) PU.1 protein [Mus musculus] >gi|200972 (M38252) transcription factor Pu.1 [Mus musculus] | 5.8
1395 | L13738 | Human activated p21cdc42Hs kinase (ack) mRNA. |  |  | (AF037260) non-receptor protein tyrosine kinase Ack [Mus musculus] | 7e-23
1396 | AF042379 | Homo sapiens spindle pole body protein spc97 homolog GCP2 mRNA, complete cds | 7e-53 | 2801701 | (AF042379) spindle pole body protein spc97 homolog GCP2 | 1e-16
1397 | AF047441 | Homo sapiens RNA polymerase I 40kD subunit mRNA, complete cds | 6e-53 | 3914807 | DNA-DIRECTED RNA POLYMERASE I 40 KD POLYPEPTIDE (RPA40) (RPA39) >gi|2266929 (AF008442) RNA polymerase I subunit hRPA39 [Homo sapiens] | 4e-19
1398 | AF104670 | Homo sapiens cell cycle protein (PA2G4) gene, exons 6 through 13, and complete cds | 2e-53 | <NONE> | <NONE> | <NONE>
1399 | S60754 | {VNTR locus DX24, hypervariable tandem repeat cluster} [human, Genomic, 2991 nt] > :: gb|L07935|HUMVNT J A Homo sapiens microsatellite VNTR DNA sequence. | 2e-53 | 1209669 | (U38810) CAGR1 [Homo sapiens] >gi|3098420 (AF040945) homeotic regulator homolog MAB21 [Mus musculus] | 4.6
1400 | D86972 | Human mRNA for KIAA0218 gene, complete cds | 1e-53 | 3426041 | (AC005168) unknown protein [Arabidopsis thaliana] | 9.1
SEQ ID | Nearest Neighbor (BlastN vs. Genbank): ACCESSION | DESCRIPTION | P VALUE | Nearest Neighbor (BlastX vs. Non-Redundant Proteins): ACCESSION | DESCRIPTION | P VALUE
1401 | AJ236682 | Homo sapiens chromosome 22 CpG island DNA, genomic MseI fragment, clone 22CGIB49E6, complete read | 7e-54 | 3928721 | (AL034355) putative cytochrome oxidase subunit I [Streptomyces coelicolor] | 
1402 | AJ236682 | Homo sapiens chromosome 22 CpG island DNA, genomic MseI fragment, clone 22CGIB49E6, complete read | 6e-54 | 3928721 | (AL034355) putative cytochrome oxidase subunit I [Streptomyces coelicolor] | 0.28
1403 | M37583 | Human histone (H2A.Z) mRNA, complete cds. | 6e-54 | 70711 | histone H2A.F, embryonic - chicken | 2e-16
1404 | AJ009947 | Homo sapiens mRNA for putative ATPase, partial | 6e-54 | 3550295 | [illegible] putative ATPase [Homo sapiens] | 3e-18
1405 | Y08459 | B.taurus mRNA for novel cytoplasmic protein | 2e-54 | <NONE> | <NONE> | <NONE>
1406 | AF042384 | Homo sapiens BC-2 protein mRNA, complete cds | 2e-54 | 2828147 | (AF042384) BC-2 protein [Homo sapiens] | 2e-14
1407 | AF042379 | Homo sapiens spindle pole body protein spc97 homolog GCP2 mRNA, complete cds | 8e-55 | 2801701 | (AF042379) spindle pole body protein spc97 homolog GCP2 | 2e-17
1408 | AF005355 | Oryctolagus cuniculus translation initiation factor eIF2C mRNA, complete cds | 7e-55 | 3253159 | (AF005355) translation initiation factor eIF2C | 3e-53
1409 | AF008442 | Homo sapiens RNA polymerase I subunit hRPA39 mRNA, complete cds | 3e-55 | 3335138 | (AF047441) RNA polymerase I 40kD subunit [Homo sapiens] | 3e-20
1410 | AF047441 | Homo sapiens RNA polymerase I 40kD subunit mRNA, complete cds | 3e-55 | 3335138 | (AF047441) RNA polymerase I 40kD subunit [Homo sapiens] | 3e-20






1411 


X08004 


Human mRNA for 
Rap1B protein > :: 
emb|A08693|A08693 
H.sapiens rap1b 
cDNA 


2e-55 


539995 


transforming protein rap1b - rat 
(strain Copenhagen) 


2e-18 


1412 


AF010403 


Homo sapiens ALR 
mRNA, complete cds 


2e-55 


2358285 


(AF010403) ALR [Homo 
sapiens] 


1e-49 


1413 


M77016 


Human tropomodulin 
mRNA, complete cds. 


8e-56 


262249 


(S52010) orf1 5' of EpoR [mice, 
Peptide, 85 aa] [Mus sp.] 


0.027 


1414 


AB020633 


Homo sapiens mRNA 
for KIAA0826 
protein, partial cds 


2e-56 


<NONE> 


<NONE> 


<NONE> 


1415 


X87489 


H.sapiens genomic 
DNA (chromosome 
3: clone NL1243D) 


2e-56 


1814029 


(U84501) cuticle collagen 
[Caenorhabditis briggsae] 


0.038 


1416 


AB007893 


Homo sapiens 
KIAA0433 mRNA, 
partial cds 


2e-56 


2887437 


(AB007893) KIAA0433 [Homo 
sapiens] 


9e-21 


1417 


X78925 


H.sapiens HZF2 
mRNA for zinc finger 
protein 


1e-56 


3342002 


(AF054180) hematopoietic cell 
derived zinc finger protein 
[Homo sapiens] 


2e-21 


1418 


Z56281 


H.sapiens mRNA for 
interferon regulatory 
factor 3 


9e-57 


2497442 


INTERFERON 
REGULATORY FACTOR 3 
factor 3 [Homo sapiens] 


2e-21 


1419 


U78772 


Homo sapiens nuclear 
VCP-like protein 
NVLp.1 


8e-57 


2406565 


(U68140) nuclear VCP-like 
protein NVLp.2 [Homo sapiens] 


5e-20 


1420 


D79994 


Human mRNA for 
KIAA0172 gene, 
partial cds 


3e-57 


1136404 


(D79994) similar to ankyrin of 
Chromatium vinosum. [Homo 
sapiens] 


9e-38 


1421 


AB002342 


Human mRNA for 
KIAA0344 gene, 
complete cds 


1e-57 


2224629 


(AB002342) KIAA0344 [Homo 
sapiens] 


4e-20 


1422 


L19437 


Human transaldolase 
mRNA containing 
transposable element, 
complete cds 


1e-57 


1553119 


(U63159) transaldolase [Mus 
musculus] 


2e-20 


1423 


D17532 


Human mRNA for 
RCK, complete cds 


9e-58 


129376 


PROBABLE ATP- 
DEPENDENT RNA 
HELICASE P54 (ONCOGENE 
RCK) (DEAD BOX PROTEIN 
6) 


1e-10 
















1424 


X79568 


H.sapiens BDP1 
mRNA for protein- 
tyrosine-phosphatase 


9e-58 


1871531 


(X79568) protein-tyrosine- 
phosphatase 


1e-22 


1425 


X79568 


H.sapiens BDP1 
mRNA for protein- 
tyrosine-phosphatase 


9e-58 


1871531 


(X79568) protein-tyrosine- 
phosphatase 


9e-23 


1426 


AB012295 


Homo sapiens 
HKE1.5 mRNA for 
GDS-related protein, 
complete cds 


7e-58 


2648021 


(Z97184) RGL2 [Homo sapiens] 


9e-19 


1427 


AF086040 


Homo sapiens full 
length insert cDNA 
clone YX5?Ff)7 


1C-JO 




glutamine (Q)-rich factor 1, 
QRP-1 - mouse factor 1, QRF-1 
[mice, B-cell leukemia, BCL1, 
Peptide Partial, 84 aa] 


3e-36 


1428 


AB018195 


Homo sapiens ca xi 
mRNA for carbonic 
anhydrase-related 
protein XI, complete 
cds 


4e-59 


<NONE> 


<NONE> 


<NONE> 


1429 


AF071777 


Mus musculus IRE1 
(Ire1) mRNA, 
complete cds 


4e-59 


3766209 


(AF071777) IRE1 [Mus 
musculus] 


7e-2S 


1430 


AB000462 


Homo sapiens mRNA 
for SH3 binding 
protein, complete cds, 
clone:RES4-23A 


3e-59 


<NONE> 


<NONE> 


<NONE> 


1431 


AF038172 


Homo sapiens clone 
23923 mRNA 
sequence 


3e-59 


3758855 


(Z98551) MAL3P6.11 
[Plasmodium falciparum] 


1.3 


1432 


Z84812 


Human DNA 
sequence from phage 
)TEL from a contig 
from the tip of the 
short arm of 
chromosome 16, 
spanning 2Mb of 
16p13.3 Contains 
SSTs 


le-59 


400927 


RIBONUCLEOPROTEIN 
RB97D ribonucleoprotein 
[Drosophila melanogaster] 


2.5 


1433 


U36484 


Human laminin- 
binding protein gene, 
partial cds, and E2 
small nucleolar RNA 
gene, complete 
sequence 


1e-59 


226005 


protein 40kD [Mus musculus] 


7e-05 

















1434 


L11285 


Homo sapiens ERK 
activator kinase 
(MEK2) mRNA. 


1e-59 


2499630 


DUAL SPECIFICITY 
MITOGEN-ACTIVATED 
PROTEIN KINASE KINASE 2 
(MAP KINASE KINASE 2) 
(MAPKK 2) kinase type 2 
[Gallus gallus] 


3e-21 




1435 


AF086555 


Homo sapiens full 
length insert cDNA 
clone ZE14E04 


4e-60 


3287674 


(AC005239) F23149_1 [Homo 
sapiens] 


2e-04 




1436 


M24766 


Human (clone 
ptiAiv^-izj aipna-_ 
collagen type IV 


4e-60 


29551 


(X05610) alpha (2) chain 
[Homo sapiens] 


6e-15 




1437 


X65550 


H.sapiens mki67a 
mRNA (long type) 
for antigen of 
monoclonal antibody 
Ki-67 


4e-60 


1170654 


ANTIGEN KI-67 
>gi|539555|pir||A48666 cell 
proliferation antigen Ki-67, long 
form - human Ki-67 [Homo 
sapiens] 


3e-15 


1438 


M27319 


Human calmodulin 
mRNA, complete cds. 


4e-60 


1345451 


(X05949) Calmodulin (AA 2 - 
59) (449 is 1st base in codon) 
[Drosophila melanogaster] 


7e-20 


1439 


Y12781 


Homo sapiens mRNA 
for transducin (beta) 
like 1 protein 


3e-60 


62133 


(X06172) put. 134 kD protein 
(AA 1 - 1187); put. replicase 


7.4 


1440 


AB002383 


Human mRNA for 
KIAA0385 gene, 
complete cds 


1e-60 


1001548 


(D64000) hypothetical protein 


4.4 


1441 


AF070614 


Homo sapiens clone 
24732 unknown 
mRNA, partial cds 


2e-61 


3283879 


(AF070614) unknown [Homo 
sapiens] 


3e-17 


1442 


AB002326 


Human mRNA for 
KIAA0328 gene, 
partial cds 


6e-62 


547891 


MICROTUBULE- 
ASSOCIATED PROTEIN 4 
microtubule-associated protein- 
Li [Bos taurus] 


5.6 


1443 


AF086471 


Homo sapiens full 
length insert cDNA 
clone ZD88A01 


5e-62 


<NONE> 


<NONE> 


<NONE> 


1444 


AB002311 


Human mRNA for 
KIAA0313 gene, 
complete cds 


2e-62 


2506357 


DIHYDROXYPHENYLPROPIONATE 
1,2-DIOXYGENASE 
>gi|1657544 (U73857) similar 
to mcpl gene (catechol 2,3- 
dioxygenase) of A. eutrophus 3- 
(2,3- 
dihydroxyphenylpropionate)1,2- 
dioxygenase 2,3- 
dihydroxyphenylpropionate 1,2- 
dioxygenase 


3.4 



1445 



AF069737 



Xenopus laevis 
notchless (nle) 
mRNA, complete cds 



2e-62 



3687833 



(AF069737) notchless [Xenopus 
laevis] 



1e-55 



1446 



AF044209 



Homo sapiens nuclear 
receptor co-repressor 
N-CoR mRNA, 
complete cds 



5e-63 



2137603 



nuclear receptor co-repressor N- 
CoR - mouse 
>gi|1583865|prf||2121436A 
thyroid hormone receptor co- 
repressor [Mus musculus] 



2e-47 



1447 



M69238 



Human aryl 
hydrocarbon receptor 
nuclear translocator 
(ARNT) mRNA, 
complete cds. 



2e-63 



2702319 



(AF001307) aryl hydrocarbon 
receptor nuclear translocator; 
Arnt [Homo sapiens] 



5e-19 



1448 



X80497 



H.sapiens PHKLA 
mRNA 



2e-63 



1170685 



PHOSPHORYLASE B 
KINASE ALPHA 
REGULATORY CHAIN 
LIVER ISOFORM 
(PHOSPHORYLASE KINASE 
ALPHA L SUBUNIT) 
>gi|663010 (X80497) 
phosphorylase kinase 
phosphorylase kinase alpha 
subunit [Homo sapiens] 



5e-22 



1449 



AF031141 



Homo sapiens 
ubiquitin conjugating 
enzyme 



2e-63 



2623260 



(AF031141) ubiquitin 
conjugating enzyme [Homo 
sapiens] 



1e-23 



1450 



Z37166 



H.sapiens BAT1 
mRNA for nuclear 
RNA helicase 



6e-64 



2500529 



PROBABLE ATP- 
DEPENDENT RNA 
HELICASE P47 
>gi|2135840|pir||I37201 nuclear 
RNA helicase (DEAD family) 
BAT1 - human >gi|587146 
(Z37166) nuclear RNA helicase 
(DEAD family) [Homo sapiens] 



9e-24 



1451 



M64240 



Human helix-loop- 
helix zipper protein 
(max) mRNA, 
complete cds. > :: 
gb|I41138|I41138 
Sequence 1 from 
patent US 5624818 > 

gb|I77062|I77062 
Sequence 1 from 
patent US 5693487 



5e-64 



88175 



Myc-binding factor Max, short 
form - human 



8e-22 












1452 


M98252 


Homo sapiens lysyl 
hydroxylase (partial 
clone 2.2 Kb LH) 
RNA, complete 
mature peptide. 


2e-64 


400205 


PROCOLLAGEN-LYSINE,2- 
OXOGLUTARATE 5- 
DIOXYGENASE 
PRECURSOR (LYSYL 
HYDROXYLASE) lysyl 
hydroxylase [Homo sapiens] 


7e-22 


1453 


U09550 


Human oviductal 
glycoprotein mRNA, 
complete cds. 


8e-65 


2493676 


OVIDUCT-SPECIFIC 
GLYCOPROTEIN 
PRECURSOR (OVIDUCTAL 
GLYCOPROTEIN) 
(OVIDUCTIN) 


2e-11 


1454 


X67877 


R.norvegicus mRNA 
for cytosolic 
resiniferatoxin- 
binding protein 


7e-65 


423664 


resiniferatoxin-binding protein 
RBP-26, cytosolic - rat 
>gi|311660 (X67877) cytosolic 
resiniferatoxin binding protein 
RBP-26 [Rattus norvegicus] 
>gi|1093373|prf||2103310A 
resiniferatoxin-binding protein 
[Rattus norvegicus] 


2e-40 


1455 


AB018254 


Homo sapiens mRNA 
for KIAA0711 
protein, complete cds 


6e-65 


92298 


glutamine/glutamic acid-rich 
protein 


0.98 


1456 


J03607 


Human 40-kDa 
keratin intermediate 
filament precursor 
gene. 


3e-65 


1070608 


keratin 19, type I, cytoskeletal - 
human sapiens J 


4e-07 


1457 


U65896 


Human gamma- 
glutamyl carboxylase 
gene, complete cds 


2e-65 


<NONE> 


<NONE> 


<NONE> 


1458 


U07681 


Human NAD(H)- 
specific isocitrate 
dehydrogenase alpha 
subunit precursor 
mRNA, complete cds. 


2e-65 


1708399 


ISOCITRATE 
DEHYDROGENASE (NAD), 
MITOCHONDRIAL SUBUNIT 
ALPHA PRECURSOR 
(ISOCITRIC 
DEHYDROGENASE) (NAD+- 
SPECIFIC ICDH) 
dehydrogenase alpha chain 
precursor - human >gi|706839 
subunit precursor [Homo 
sapiens] 


4e-'»6 


1459 


U8S080 


Human zinc finger 
protein (LD5-1) gene, 
exons 4, 5 and 6, and 
complete cds 


2e-65 


1373394 


(U57796) zinc finger protein 
[Homo sapiens] >gi|2306773 


2e-39 












1460 


M96625 


Gallus domesticus 
tensin mRNA 
sequence. 


3e-66 


2134419 


tensin - chicken (fragment) 
>gi|63805 (Z18529) tensin 
[Gallus gallus] >gi|212755 
(L06662) tensin [Gallus gallus] 


1e-51 


1461 


U13262 


Mus musculus myelin 
gene expression 
factor (MEF-2) 
mRNA, partial cds. 


1e-70 


536926 


(U13262) myelin gene 
expression factor [Mus 
musculus] 




1462 


U64033 


Mus musculus Tera 
(Tera) mRNA, 
complete cds 


5e-72 


1575505 


(U64033) Tera [Mus musculus] 


9r-"?4 


1463 


X78989 


M. musculus mRNA 
for testin 


6e-74 


1351218 


TESTIN 2 (TES2) 
[CONTAINS: TESTIN 1 


8e-31 


1464 


U64033 


Mus musculus Tera 
(Tera) mRNA, 
complete cds 


2e-74 


1575505 


(U64033) Tera [Mus musculus] 




1465 


AF057365 


Canis familiaris UDP 
N-acetylglucosamine 
transporter mRNA, 
complete cds 


9e-79 


3298605 


(AF057365) UDP N- 
acetylglucosamine transporter 
[Canis familiaris] 


9e-10 


1466 


AJ006064 


Rattus norvegicus 
mRNA for coronin- 
like protein 


1e-82 


3757680 


(AJ006064) coronin-like protein 
[Rattus norvegicus] 


3e-62 


1467 


U91582 


Macaca fascicularis 
UDP- 

glucuronosyltransfera 
se mRNA, complete 
cds 


4e-89 


140396 


KARYOGAMY PROTEIN 
KAR4 yeast (Saccharomyces 
cerevisiae) 


1e-08 


1468 


X06762 


Mouse Hox2.3 
mRNA 


3e-92 


123255 


HOMEOBOX PROTEIN HOX- 
B7 (HOX-2C) 


9e-23 


1469 


AB016930 


Cricetulus griseus 
mRNA for 
phosphatidylglycerophosphate 
synthase, 
complete cds 


5e-94 


4159682 


(AB016930) 
phosphatidylglycerophosphate 
synthase [Cricetulus griseus] 


7e-34 


1470 


X74504 


M. musculus T10 
mRNA 


7e-97 


1711658 


SER/THR-RICH PROTEIN 
T10 IN DGCR REGION 
>gi|480900|pir||S37488 gene 
T10 protein - mouse 


3e-59 



WO 01/02568 


1471 


U13175 


Rattus norvegicus 
clone ubc 10a 
ubiquitin conjugating 
enzyme (E217kB) 
mRNA, complete cds 


3e-98 


1351345 


UBIQUITIN-CONJUGATING 
ENZYME E2-17 KD 3 
(UBIQUITIN-PROTEIN 
LIGASE) (UBIQUITIN 
CARRIER PROTEIN) 
(E2(17)KB 3) 
>gi|1085588|pir||S53358 
ubiquitin conjugating enzyme 
(E217kB) - rat >gi|595666 
(U13175) ubiquitin conjugating 
enzyme [Rattus norvegicus] 
norvegicus] >gi| 1145691 
(U39318) UbcH5C [Homo 
sapiens] 


5e-05 


1472 


S79873 


h-lamp-2=lysosome- 
associated membrane 

(LAMP2) mRNA, 
alternatively spliced 
form h-lamp-2b, 
complete cds. 


e-119 


<NONE> 


<NONE> 


<NONE> 


1473 


D13623 


c\ ill y j i ivin / \ tor pj** 
protein, complete cds 


e-112 


480379 


ribosome-binding protein p34 - 
rat sp.] 


2e-05 


1474 


AB013357 


Mus musculus mRNA 
for 49 kDa zinc finger 
protein, complete cds 


e-136 


4153886 


(AB013357) 49 kDa zinc finger 
protein 


5e-08 


1475 


AB016930 


Cricetulus griseus 
mRNA for 
phosphatidylglycerophosphate 
synthase, 
complete cds 


e-117 


4159682 


(AB016930) 
phosphatidylglycerophosphate 
synthase [Cricetulus griseus] 


4e-32 


1476 


U38253 


Rattus norvegicus 
initiation factor eIF- 
2B gamma subunit 
(eIF-2B gamma) 
mRNA, complete cds 


e-103 


2494312 


TRANSLATION INITIATION 
FACTOR EIF-2B GAMMA 
SUBUNIT (EIF-2B GDP-GTP 
EXCHANGE FACTOR) 
subunit [Rattus norvegicus] 


3e-42 


1477 


X73683 


R.norvegicus mRNA 
for histone H3.3 


e-117 


122075 


- (II3.3Q) liutmie 113.3 fiuiiflj 
(Drosophila melanogaster) 
histone H3.3B - chicken 
>gi|21 19023|pir|jS6I218 histone 
H3.3 - fruit fly (Drosophila 
hydei) 1-1361 lOrvctolaeus 
cuniculus] >°i|8046 (X53822) 
Histone H3.3Q gene product 

[Drosophila melanogaster] 

>gi|51 198 gallus] >gi|161 190 
(M17876) histone H3 [Spisula 
solidissima] >gi|2 11853 
(Ml 1393) histone 3.3 [Gallus 
gallus] >gi|306848 (Ml 1354) 
H3.3 histone [Homo sapiens] 
melanogaster] >gi|963031 
(X81205) histone H3.3 H3.3A 
variant [Drosophila 
melanogaster] musculus] 


1e-45 


1478 


U32498 


Rattus norvegicus 
rsec8 mRNA, partial 
cds 


e-108 


2143962 


rsec8 - rat (fragment) 
>gi|1019441 (U32498) rsec8 
[Rattus norvegicus] 


7e-48 


1479 


U41736 


Mus musculus ancient 
ubiquitous 46 kDa 
protein AUP1 
precursor (Aup1) 
mRNA, complete cds 


e-146 


1517822 


(U41736) ancient ubiquitous 46 
kDa protein AUP46 precursor 
[Mus musculus] 


5e-49 


1480 


AF041338 


Bos taurus vacuolar 
proton pump subunit 
SFD alpha isoform 
(SFD) mRNA, 
complete cds 


e-119 


2895578 


(AF041338) vacuolar proton 
pump subunit SFD alpha 
isoform [Bos taurus] 


3e-49 


1481 


AF064553 


Mus musculus NSD1 
protein mRNA, 
complete cds 


e-121 


3329465 


(AF064553) NSD1 protein 
[Mus musculus] 


2e-50 


1482 


AB000517 


Rattus sp. mRNA for 
CDP-diacylglycerol 
synthase, complete 
cds 


e-146 


1517822 


(U41736) ancient ubiquitous 46 
kDa protein AUP46 precursor 
[Mus musculus] 


2e-51 


1483 


D38517 


Mouse mRNA for 
Dhm1 protein, 
complete cds 


e-118 


2137562 


mouse Dhm1 protein - mouse 
musculus] 


6e-54 
1484 


X54352 


M.domesticus MD6 
mRNA 


e-139 


1085499 


CDC4 repeat unit-containing 
protein - mouse 


1e-55 


1485 


U57692 


Mus musculus N- 
terminal asparagine 
amidohydrolase 
(Ntan1) mRNA, 
complete cds 


e-118 


2498797 


PROTEIN N-TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE 
(PROTEIN NH2-TERMINAL 
ASPARAGINE DEAMIDASE) 
(NTN-AMIDASE) (PNAD) 
(PROTEIN NH2-TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE) 
(PNAA) >gi|1373365 (U57691) 
N-terminal asparagine 
amidohydrolase [Mus musculus] 
amidohydrolase [Mus musculus] 


5e-57 


1486 


X80169 


M. musculus mRNA 
for 200 kD protein 


e-119 


1717793 


PROTEIN TSG24 (MEIOTIC 
CHECK POINT 
REGULATOR) 
>gi|1083553|pir||A35117 tsg24 


9e-58 


1487 


U57692 


Mus musculus N- 
terminal asparagine 
amidohydrolase 
(Ntan1) mRNA, 
complete cds 


e-120 


2498797 


PROTEIN N-TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE 
(PROTEIN NH2-TERMINAL 
ASPARAGINE DEAMIDASE) 
(NTN-AMIDASE) (PNAD) 
(PROTEIN NH2-TERMINAL 
ASPARAGINE 
AMIDOHYDROLASE) 
(PNAA) >gi|1373365 (U57691) 
N-terminal asparagine 
amidohydrolase [Mus musculus] 
amidohydrolase [Mus musculus] 


8e-58 


1488 


U08215 


Mus musculus Hsp70 
related NST-1 (hsr.1) 
mRNA, complete cds 


e-109 


473407 


(U08215) NST-1 [Mus 
musculus] 


7e-58 


1489 


D85926 


Mouse mRNA for 
Ray, complete cds 


e-110 


1944389 


(D85926) Ray [Mus musculus] 


2e-58 


1490 


L20427 


Rattus norvegicus 
dihydroxypolyprenylbenzoate 
methyltransferase 
mRNA, complete cds 


e-123 


457372 


(L20427) 
hydroxypolyprenylbenzoate 
methyltransferase 
hydroxypolyprenylbenzoate 
methyltransferase [Rattus 
norvegicus] 


4e-59 


1491 


X56044 


M. musculus mRNA 
for protein Htf9C 


e-121 


3183977 


(X56044) protein Htf9C [Mus 
musculus] 


1e-60 












1492 


S74774 


p59fyn(T)=OKT3- 
induced calcium 
influx regulator 


e-163 


729896 


PROTO-ONCOGENE 
TYROSINE-PROTEIN 
KINASE FYN (P59-FYN) 
>gi|420217|pir||A44991 protein- 
tyrosine kinase (EC 2.7.1.112) 
fyn - mouse 


8e-63 


1493 


U88873 


Mus musculus BUB2 
like protein 1 
(HBLP1) mRNA, 
complete cds 


e-123 


4099611 


(U88873) BUB2-like protein 1 
[Mus musculus] 


1e-63 


1494 


U48852 


Cricetulus griseus HT 
protein mRNA, 
complete cds. 


e-117 


1216486 


(U48852) HT protein 
[Cricetulus griseus] 


7e-64 


1495 


AF032667 


Rattus norvegicus 
rexo70 mRNA, 
complete cds 


e-142 


2827160 


(AF032667) rexo70 [Rattus 
norvegicus] 


5e-66 


1496 


M62722 


Chinese hamster 
phosphatidylserine 
decarboxylase 
mRNA, 3' end. 


e-114 


118910 


PHOSPHATIDYLSERINE 

DECARBOXYLASE 

PROENZYME 

>gi|109423|pir||A38732 

phosphatidylserine 

decarboxylase (EC 4.1.1.65) - 

Chinese hamster (fragment) 


2e-67 


1497 


AF072758 


Mus musculus fatty 
acid transport protein 
3 mRNA, partial cds 


e-130 


3335567 


(AF072758) fatty acid transport 
protein 3; FATP3 [Mus 
musculus] 


1e-67 


1498 


AB005549 


Rattus norvegicus 
mRNA for atypical 
PKC specific binding 
protein, complete cds 


e-113 


3868778 


(AB005549) atypical PKC 
specific binding protein [Rattus 
norvegicus] 


2e-69 


1499 


U57344 


Mus musculus 
homeobox protein 
Meis3 mRNA, 
complete cds 


e-143 


3024124 


HOMEOBOX PROTEIN 
MEIS3 


6e-72 


1500 


U09874 


Mus musculus SKD3 
mRNA, complete cds. 


e-142 


2493735 


SKD3 PROTEIN SKD3 [Mus 
musculus] 


1e-72 


1501 


U72194 


Mus musculus 
muskelin mRNA, 
complete cds 


e-148 


3493462 


(U72194) muskelin [Mus 
musculus] 


2e-74 


1502 


X80169 


M. musculus mRNA 
for 200 kD protein 


e-155 


1717793 


PROTEIN TSG24 (MEIOTIC 
CHECK POINT 
REGULATOR) 
>gi|1083553|pir||A35117 tsg24 


3e-77 









1503 


U72194 


Mus musculus 
muskelin mRNA, 
complete cds 


e-154 


3493462 


(U72194) muskelin [Mus 
musculus] 


2e-78 


1504 


Y12836 


Cricetulus griseus 
mRNA for Zn finger 
factor 


e-146 


3150148 


(Y12836) Zn finger factor 
[Cricetulus griseus] 


3e-83 
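Several P VALUE cells in the table above use a shortened form such as e-119, and some rows carry <NONE> in the BlastX columns. The following is a minimal Python sketch, not part of the original disclosure, for converting such cells to numbers. It assumes that a bare "e-119" stands for a value of roughly 1e-119 whose leading digit was dropped by the report formatter; the function name parse_p_value is hypothetical.

```python
import re
from typing import Optional


def parse_p_value(cell: str) -> Optional[float]:
    """Convert one P VALUE cell from the table into a float.

    Assumption (not stated in the source): "e-119" abbreviates about
    1e-119, so a mantissa of 1 is substituted for the missing digit.
    """
    text = cell.strip()
    if text == "<NONE>":
        # No BlastX neighbor was reported for this sequence.
        return None
    if re.fullmatch(r"e-\d+", text):
        text = "1" + text
    return float(text)


if __name__ == "__main__":
    for cell in ["7e-54", "e-119", "0.28", "<NONE>"]:
        print(cell, "->", parse_p_value(cell))
```

Cells damaged by character recognition (for example a letter in place of a digit) would still raise ValueError here, which seems preferable to guessing a value.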
Table 5 



SEQ ID 


Start 


Stop 


Score 


Direction 


Description 


29 


295 


421 


5872 


For 


mkk like kinases 


30 


31 


182 


3943 


For 


Basic region plus leucine zipper 
transcription factors 


J 1 


Z.70 


397 


5625 


For 


mkk like kinases 


186 


175 


395 


7660 


For 


SH2 Domain 


187 


358 


432 


4320 


For 


Ank repeat 


196 


37 


322 


6049 


For 


mkk like kinases 


234 


23 


121 


4607 


For 


SH3 Domain 


308 


110 


172 


4150 


For 


Zinc finger, C2H2 type 


410 


42 


191 


4036 


For 


Basic region plus leucine zipper 
transcription factors 


431 


71 


428 


5538 


Rev 


ATPases Associated with Various 
Cellular Activities 


552 


116 


288 


3930 


Rev 


Basic region plus leucine zipper 
transcription factors 


639 


157 


561 


5797 


For 


ATPases Associated with Various 
Cellular Activities 


746 


209 


427 


5379 


For 


Fibronectin type III domain 


768 


116 


288 


3930 


For 


Basic region plus leucine zipper 
transcription factors 


807 


339 


392 


3620 


For 


Zinc finger, C2H2 type 


820 


341 


406 


2930 


Rev 


EF-hand 




1 08 
1 Uo 


zoz 


41 7Q 




Basic region plus leucine zipper 
transcription factors 


836 


158 


353 


4430 


For 


Basic region plus leucine zipper 
transcription factors 


1157 


41 


444 


5279 


Rev 


protein kinase 


1192 


186 


416 


5469 


For 


Fibronectin type III domain 


1268 


238 


315 


3540 


For 


Ank repeat 


1269 


79 


240 


11640 


For 


LIM domain containing proteins 


1288 


73 


234 


3953 


For 


Basic region plus leucine zipper 
transcription factors 





1309 


248 


404 


8226 


for 


LIM domain containing proteins 


1324 


294 


356 


4690 


for 


Zinc finger, C2H2 type 


1325 


1 


234 


8981 


for 


C2 domain (prot. kinase C like) 


1336 


66 


164 


6390 


for 


WD domain, G-beta repeats 


1360 


222 


377 


8686 


for 


LIM domain containing proteins 


1365 


69 


257 


5221 


for 


Basic region plus leucine zipper 
transcription factors 


1380 


42 


140 


7130 


for 


WD domain, G-beta repeats 


1 J80 


741 
ZIO 


J70 


87"*fi 
o / Jyj 


for 


LIM domain containing proteins 




ZZZ 


ISO 


i n^st 


for 


TVvnctti 
1 1 y iJoiii 


1417 


o 
5 


1^4 


OU/J 


for 




1 A <A 


AQ 


zuy 




for 


Basic region plus leucine zipper 
transcription factors 


1464 


4 


180 


4978 


for 


RNA recognition motif, (aka RRM, 
RBD, or RNP domain) 


1478 


54 


437 


5176 


for 


protein kinase 


1496 


241 


520 


3929 


for 


Helicases conserved C-terminal domain 


1 AQ£L 


A A 


/ci 7 
olz 


<; 1 87 

Jlo/ 


for 


protein kinase 


1503 


154 


216 


4870 


for 


Zinc finger, C2H2 type 


1514 


2 


252 


4662 


for 


RNA recognition motif, (aka RRM, 
RBD, or RNP domain) 


1527 


156 


212 


3520 


for 


Zinc finger, C2H2 type 


1538 


9 


635 


11087 


for 


wnt family of developmental signaling 
proteins 


1540 


289 


471 


4107 


for 


Basic region plus leucine zipper 
transcription factors 


1549 


200 


391 


4118 


for 


Basic region plus leucine zipper 
transcription factors 


1556 


163 


354 


3958 


for 


Basic region plus leucine zipper 
transcription factors 


1557 


207 


398 


4038 


for 


Basic region plus leucine zipper 
transcription factors 


1563 


107 


298 


3978 


for 


Basic region plus leucine zipper 
transcription factors 





1 AT} 

lozz 


1 sn 






for 


Basic region plus leucine zipper 
transcription factors 


1630 


100 


291 


3998 


for 


Basic region plus leucine zipper 
transcription factors 


1674 


196 


258 


4880 


for 


Zinc finger, C2H2 type 


1676 


Q 

y 


SO 


aa i n 


for 


PTotnf»nVinv Domain 


1677 


J 16 


joy 


C-70A 

J /o\) 


rev 


T ri t r\t"**rl r\~v i n c 

1 111U1 CUUAllld 


loos 


iuy 


A 1 A 

41U 




for 


Dap Tsi m 1 1 V 
I\d5 ldllllljr 


1704 


184 


ill. 


jy// 


for 


Basic region plus leucine zipper 
transcription factors 


1707 


92 


439 


24100 


rev 


Phosphatidylinositol-specific 
phospholipase C, Y domain 


1711 


263 


361 


6400 


for 


WD domain, G-beta repeats 


1744 


238 


433 


10572 


rev 


Serine carboxypeptidases 


1755 


281 


367 


2580 


for 


EF-hand 


1762 


236 


334 


5880 


for 


WD domain, G-beta repeats 


1779 


64 


126 


4790 


for 


Zinc finger, C2H2 type 


1801 


295 


351 


4030 


for 


Zinc finger, C2H2 type 


1804 


301 


378 


3460 


for 


Ank repeat 


1808 


36 


161 


4170 


for 


Basic region plus leucine zipper 
transcription factors 


1 O A \ 

1 8 1 1 


184 


315 


07QA 


for 


N-terminal homology in Ets domain 


1 O 1 /I 

1814 


127 


zy4 


1 u / /u 


for 


Bromodomain (conserved sequence 
found in human, Drosophila and yeast 
proteins.) 


1818 


9 


146 


4741 


for 


Double-stranded RNA binding motif 


1819 


278 


355 


3460 


for 


Ank repeat 


1820 


123 


299 


12150 


for 


Homeobox Domain 


1821 


127 


303 


12180 


for 


Homeobox Domain 


1830 


184 


267 


4270 


for 


Ank repeat 


1832 


18 


173 


8987 


for 


SH3 Domain 


1835 


51 


206 


8987 


for 


SH3 Domain 


1839 


224 


307 


4270 


for 


Ank repeat 


1846 


12 


398 


36700 


for 


G-protein alpha subunit 


1909 


160 


258 


6370 


for 


WD domain, G-beta repeats 


1911 


35 


151 


9335 


for 


Zinc finger, C3HC4 type (RING finger) 


1980 


60 


197 


7917 


for 


Zinc finger, C3HC4 type (RING finger) 


2065 


253 


306 


5410 


for 


Zinc finger, CCHC class 


2135 


2 


401 


10596 


for 


ATPases Associated with Various 
Cellular Activities 


2216 


90 


179 


5380 


for 


WW/rsp5/WWP domain containing 
proteins 


2218 


127 


225 


5500 


for 


WD domain, G-beta repeats 


2281 


20 


387 


6044 


for 


Protein Tyrosine Phosphatase 


2282 


183 


353 


5136 


for 


C2 domain (prot. kinase C like) 


2286 


12 


382 


5228 


for 


protein kinase 


2310 


20 


371 


5962 


for 


Protein Tyrosine Phosphatase 


2363 


48 


211 


4132 


for 


Basic region plus leucine zipper 
transcription factors 


2424 


43 


194 


3996 


for 


Basic region plus leucine zipper 
transcription factors 


2428 


25 


350 


4675 


for 


Dual specificity phosphatase, catalytic 
domain 


2562 


18 


101 


4560 


for 


Ank repeat 


2577 


0 


311 


10295 


for 


4 transmembrane segments integral 
membrane proteins 


2591 


60 


165 


4560 


for 


SH2 Domain 


2684 


9 


461 


5759 


for 


ATPases Associated with Various 
Cellular Activities 


2826 


116 


400 


16107 


for 


DEAD and DEAH box helicases 


2859 


100 


320 


5550 


rev 


ATPases Associated with Various 
Cellular Activities 


2871 


198 


392 


9384 


for 


DEAD and DEAH box helicases 


2944 


18 


281 


10480 


for 


Calpain large subunit, domain III 


2969 


5 


387 


5976 


rev 


protein kinase 


3015 


131 


214 


3600 


for 


Ank repeat 


3047 


191 


292 


5295 


for 


WD domain, G-beta repeats 


3081 


190 


252 


4360 


for 


Zinc finger, C2H2 type 


3108 


275 


36/ 




for 


WD domain, G-beta repeats 


3147 


190 


369 


4022 


for 


Basic region plus leucine zipper 
transcription factors 


3152 


129 


320 


3947 


for 


Basic region plus leucine zipper 
transcription factors 


3158 


167 


334 


4180 


for 


Basic region plus leucine zipper 
transcription factors 


3175 


14 


164 


5951 


for 


mkk like kinases 


3175 


o 
* 


1 lz 




for 


protein kinase 


3178 


45 


386 


19398 


for 


ATPases Associated with Various 
Cellular Activities 


3183 


14 


O 1 C 

215 




for 


4 transmembrane segments integral 
membrane proteins 


3190 


229 


390 


6089 


for 


mkk like kinases 


3190 


118 


390 


8063 


for 


protein kinase 


3193 


293 


355 


3570 


for 


Zinc finger, C2H2 type 


3195 


0 


215 


10146 


for 


4 transmembrane segments integral 
membrane proteins 


3197 


281 


343 


4490 


for 


Zinc finger, C2H2 type 


3208 


34 


256 


4190 


for 


Basic region plus leucine zipper 
transcription factors 


3258 


138 


394 


9877 


for 


Ras family 


3266 


8 


139 


9328 


for 


ATPases Associated with Various 
Cellular Activities 


3267 


97 


180 


3820 


for 


Ank repeat 


3274 


11 


187 


15442 


for 


Fork head domain, eukaryotic 
transcription factors 


3281 


15 


182 


9681 


for 


mkk like kinases 


J/BJ 


1 6. 
ID 


1 fYJ 

1 uz 


HOou 


for 


CjV HcUlll 


3292 


208 


300 


5585 


for 


WD domain, G-beta repeats 


3297 


7 


153 


6100 


for 


Helicases conserved C-terminal domain 


3306 


161 


223 


4900 


for 


Zinc finger, C2H2 type 


3307 


43 


321 


8740 


for 


SH2 Domain 


3339 


94 


342 


14970 


for 


SH2 Domain 


3345 


65 


271 


12512 


for 


PDZ domain 


3351 


124 


270 


6068 


for 


Phorbol esters/diacylglycerol binding 
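Some sequences in Table 5 match more than one profile; SEQ ID 3190, for instance, is listed once for "mkk like kinases" and once for "protein kinase". As an illustration only, the Python sketch below keeps the highest-scoring hit per sequence. The ProfileHit record and the best_hit_per_sequence helper are hypothetical names introduced here, and the rows are copied from the table rather than derived from any tool output.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ProfileHit:
    """One Table 5 row: a profile match within a sequence."""
    seq_id: int
    start: int
    stop: int
    score: int
    direction: str  # "for" or "rev", as printed in the table
    description: str


def best_hit_per_sequence(hits: Iterable[ProfileHit]) -> Dict[int, ProfileHit]:
    """Return the highest-scoring hit for each SEQ ID."""
    best: Dict[int, ProfileHit] = {}
    for hit in hits:
        current = best.get(hit.seq_id)
        if current is None or hit.score > current.score:
            best[hit.seq_id] = hit
    return best


# Rows copied from Table 5.
HITS = [
    ProfileHit(3190, 229, 390, 6089, "for", "mkk like kinases"),
    ProfileHit(3190, 118, 390, 8063, "for", "protein kinase"),
    ProfileHit(3193, 293, 355, 3570, "for", "Zinc finger, C2H2 type"),
]

if __name__ == "__main__":
    for seq_id, hit in sorted(best_hit_per_sequence(HITS).items()):
        print(seq_id, hit.description, hit.score)
```

Ranking by raw score is a simplification: scores from different profiles are not necessarily comparable, so this should be read as a bookkeeping aid rather than a statement about which assignment is correct.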
Example 4 

Differential Expression of Polynucleotides of the Invention: 
Description of Libraries and Detection of Differential Expression 

The relative expression levels of the polynucleotides of the invention 
were assessed in several libraries prepared from various sources, including cell lines and 
patient tissue samples. Table 6 provides a summary of these libraries, including the 
shortened library name (used hereafter), the mRNA source used to prepare the cDNA 
library, the abbreviated name of the library that is used in the tables below (in quotes), 
and the approximate number of clones in the library. 



Table 6 

Description of cDNA Libraries 



Library    Description                                                   Number of Clones
(lib #)                                                                  in this Clustering

1          Km12L4                                                        307133
           Human Colon Cell Line, High Metastatic Potential
           (derived from Km12C)
           "High Colon"

2          Km12C                                                         284755
           Human Colon Cell Line, Low Metastatic Potential
           "Low Colon"

3          MDA-MB-231                                                    326937
           Human Breast Cancer Cell Line, High Metastatic Potential;
           micro-metastases in lung
           "High Breast"

4          MCF7                                                          318979
           Human Breast Cancer Cell, Non Metastatic
           "Low Breast"

8          MV-522                                                        223620
           Human Lung Cancer Cell Line, High Metastatic Potential
           "High Lung"

9          UCP-3                                                         312503
           Human Lung Cancer Cell Line, Low Metastatic Potential
           "Low Lung"

12         Human microvascular endothelial cells (HMEC) - Untreated      41938
           PCR (OligodT) cDNA library

13         Human microvascular endothelial cells (HMEC) -                42100
           Basic fibroblast growth factor (bFGF) treated
           PCR (OligodT) cDNA library

14         Human microvascular endothelial cells (HMEC) -                42825
           Vascular endothelial growth factor (VEGF) treated
           PCR (OligodT) cDNA library

15         Normal Colon - UC#2 Patient                                   34285
           PCR (OligodT) cDNA library
           "Normal Colon Tumor Tissue"

16         Colon Tumor - UC#2 Patient                                    35625
           PCR (OligodT) cDNA library
           "Normal Colon Tumor Tissue"

17         Liver Metastasis from Colon Tumor of UC#2 Patient             36984
           PCR (OligodT) cDNA library
           "High Colon Metastasis Tissue"

18         Normal Colon - UC#3 Patient                                   36216
           PCR (OligodT) cDNA library
           "Normal Colon Tumor Tissue"

19         Colon Tumor - UC#3 Patient                                    41388
           PCR (OligodT) cDNA library
           "High Colon Tumor Tissue"

20         Liver Metastasis from Colon Tumor of UC#3 Patient             30956
           PCR (OligodT) cDNA library
           "High Colon Metastasis Tissue"

21         GRRpz                                                         164801
           Human Prostate Cell Line

22         WOca                                                          162088
           Human Prostate Cancer Cell Line


The KM12L4 and KM12C cell lines are described in Example 1 above.
The MDA-MB-231 cell line was originally isolated from pleural effusions (Cailleau, J.
Natl. Cancer. Inst. (1974) 53:661), is of high metastatic potential, and forms poorly
differentiated adenocarcinoma grade II in nude mice consistent with breast carcinoma.
The MCF7 cell line was derived from a pleural effusion of a breast adenocarcinoma and
is non-metastatic. The MV-522 cell line is derived from a human lung carcinoma and is
of high metastatic potential. The UCP-3 cell line is a low metastatic human lung
carcinoma cell line; the MV-522 is a high metastatic variant of UCP-3. These cell lines
are well-recognized in the art as models for the study of human breast and lung cancer
(see, e.g., Chandrasekaran et al., Cancer Res. (1979) 39:870 (MDA-MB-231 and MCF-7);
Gastpar et al., J Med Chem (1998) 47:4965 (MDA-MB-231 and MCF-7); Ranson et
al., Br J Cancer (1998) 77:1586 (MDA-MB-231 and MCF-7); Kuang et al., Nucleic
Acids Res (1998) 26:1116 (MDA-MB-231 and MCF-7); Varki et al., Int J Cancer
(1987) 40:46 (UCP-3); Varki et al., Tumour Biol. (1990) 77:327 (MV-522 and UCP-3);
Varki et al., Anticancer Res. (1990) 70:637 (MV-522); Kelner et al., Anticancer Res
(1995) 75:867 (MV-522); and Zhang et al., Anticancer Drugs (1997) 8:696 (MV-522)).
The samples of libraries 15-20 are derived from two different patients (UC#2 and
UC#3). The bFGF-treated HMEC were prepared by incubation with bFGF at 10 ng/ml
for 2 hrs; the VEGF-treated HMEC were prepared by incubation with 20 ng/ml VEGF
for 2 hrs. Following incubation with the respective growth factor, the cells were
washed and lysis buffer added for RNA preparation. The GRRpz cell line refers to low
passage (3 passages or fewer) human prostate cells, and the WOca cell line refers to low
passage (3 passages or fewer) human prostate cancer cells.

Each of the libraries is composed of a collection of cDNA clones that in
turn are representative of the mRNAs expressed in the indicated mRNA source. In
order to facilitate the analysis of the millions of sequences in each library, the sequences
were assigned to clusters. The concept of "cluster of clones" is derived from a
sorting/grouping of cDNA clones based on their hybridization pattern to a panel of
roughly 300 7-bp oligonucleotide probes (see Drmanac et al., Genomics (1996)
57(1):29). Random cDNA clones from a tissue library are hybridized at moderate
stringency to 300 7-bp oligonucleotides. Each oligonucleotide has some measure of
specific hybridization to a given clone. The combination of the 300 measures
of hybridization for the 300 probes constitutes the "hybridization signature" for a specific clone.
Clones with similar sequence will have similar hybridization signatures. By developing
a sorting/grouping algorithm to analyze these signatures, groups of clones in a library
can be identified and brought together computationally. These groups of clones are
termed "clusters". Depending on the stringency of the selection in the algorithm
(similar to the stringency of hybridization in a classic library cDNA screening protocol),
the "purity" of each cluster can be controlled. For example, artifacts of clustering may
occur in computational clustering just as artifacts can occur in "wet-lab" screening of a
cDNA library with 400 bp cDNA fragments, at even the highest stringency. The
stringency used in the implementation of clustering herein provides groups of clones that
are in general from the same cDNA or closely related cDNAs. Closely related clones
can be a result of different length clones of the same cDNA, closely related clones from
highly related gene families, or splice variants of the same cDNA.
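
The grouping of clones by hybridization signature can be illustrated with a small sketch. The text does not disclose the actual sorting/grouping algorithm, so everything below is an assumption made for illustration: the function names `pearson` and `cluster_clones`, the use of Pearson correlation as the similarity measure, the greedy assignment to the first matching cluster, and the `stringency` threshold of 0.9. The signatures in the usage example are made-up four-probe vectors rather than 300-probe data.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length signatures."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat signature carries no pattern to compare
    return cov / sqrt(var_a * var_b)


def cluster_clones(signatures, stringency=0.9):
    """Greedily group clones whose signatures correlate with a cluster seed.

    signatures: dict mapping clone name -> list of hybridization measures.
    stringency: minimum correlation with the seed (first member) of a
                cluster; a higher value gives "purer" but smaller clusters.
    """
    clusters = []  # each cluster is a list of clone names; index 0 is the seed
    for name, signature in signatures.items():
        for members in clusters:
            if pearson(signature, signatures[members[0]]) >= stringency:
                members.append(name)
                break
        else:
            clusters.append([name])  # no cluster matched: start a new one
    return clusters


# Hypothetical four-probe signatures for three clones.
example = {
    "clone_a": [1.0, 2.0, 3.0, 4.0],
    "clone_b": [1.1, 2.0, 3.2, 3.9],
    "clone_c": [4.0, 3.0, 2.0, 1.0],
}
print(cluster_clones(example))
```

Raising `stringency` plays the role the text assigns to selection stringency: fewer clones are merged, so each cluster is more likely to represent a single cDNA.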

Differential expression for a selected cluster was assessed by first
determining the number of cDNA clones corresponding to the selected cluster in the
first library (Clones in 1st), and then determining the number of cDNA clones
corresponding to the selected cluster in the second library (Clones in 2nd). Differential
expression of the selected cluster in the first library relative to the second library is
expressed as a "ratio" of percent expression between the two libraries. In general, the
"ratio" is calculated by: 1) calculating the percent expression of the selected cluster in
the first library by dividing the number of clones corresponding to a selected cluster in
the first library by the total number of clones analyzed from the first library;
2) calculating the percent expression of the selected cluster in the second library by
dividing the number of clones corresponding to a selected cluster in a second library by
the total number of clones analyzed from the second library; 3) dividing the calculated
percent expression from the first library by the calculated percent expression from the
second library. If the "number of clones" corresponding to a selected cluster in a library
is zero, the value is set at 1 to aid in calculation. The formula used in calculating the
ratio takes into account the "depth" of each of the libraries being compared, i.e., the
total number of clones analyzed in each library.
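
The three-step ratio calculation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code actually used: the function name `expression_ratio` and its parameter names are hypothetical, and the library depths in the usage example are the clone counts given in Table 6 for lib 3 and lib 4.

```python
def expression_ratio(clones_first, total_first, clones_second, total_second):
    """Ratio of the expression of one cluster in a first library vs. a second.

    clones_*: number of clones of the selected cluster in each library.
    total_*:  total number of clones analyzed in each library (its "depth").
    A clone count of zero is replaced by 1, as described in the text,
    so that the ratio is always defined.
    """
    if total_first <= 0 or total_second <= 0:
        raise ValueError("library depths must be positive")
    clones_first = max(clones_first, 1)
    clones_second = max(clones_second, 1)
    # Steps 1 and 2: expression of the cluster as a fraction of each library.
    # (Multiplying both by 100 to get percentages would cancel in step 3.)
    fraction_first = clones_first / total_first
    fraction_second = clones_second / total_second
    # Step 3: expression in the first library relative to the second.
    return fraction_first / fraction_second


# Depths from Table 6: lib 3 (MDA-MB-231) and lib 4 (MCF7).
ratio = expression_ratio(64, 326937, 0, 318979)
print(round(ratio))
```

With 64 clones in lib 3 and none in lib 4, the ratio rounds to 62, which agrees with the entry for SEQ ID NO: 472 in Table 7.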

In general, a polynucleotide is said to be significantly differentially
expressed between two samples when the ratio value is greater than at least about 2,
preferably greater than at least about 3, more preferably greater than at least about 5,
where the ratio value is calculated using the method described above. The significance
of differential expression is determined using a z score test (Zar, Biostatistical Analysis,
Prentice Hall, Inc., USA, "Differences between Proportions," pp 296-298 (1974)).
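
A z score for the difference between two proportions can be sketched as follows. The text cites Zar but does not spell out the exact form of the test, so this sketch assumes the common pooled two-proportion statistic without a continuity correction; the function name `proportion_z_score` and its parameters are hypothetical.

```python
from math import sqrt


def proportion_z_score(clones_first, total_first, clones_second, total_second):
    """Pooled two-proportion z statistic for one cluster in two libraries.

    Assumption: pooled standard error, no continuity correction.
    """
    p_first = clones_first / total_first
    p_second = clones_second / total_second
    # Pooled proportion under the null hypothesis of equal expression.
    pooled = (clones_first + clones_second) / (total_first + total_second)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_first + 1 / total_second))
    if std_err == 0:
        return 0.0  # no clones (or only clones) of this cluster in both libraries
    return (p_first - p_second) / std_err


# Same cluster as the first row of Table 7, with depths taken from Table 6.
print(proportion_z_score(64, 326937, 0, 318979))
```

A large positive value indicates higher expression in the first library; swapping the two libraries changes only the sign of the statistic.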
EXAMPLE 5

Polynucleotides Differentially Expressed in High Metastatic Potential 
Breast Cancer Cells Versus Low Metastatic Breast Cancer Cells 

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential breast
cancer tissue and low metastatic breast cancer cells. Expression of these sequences in
breast cancer can be valuable in determining diagnostic, prognostic and/or treatment
information. For example, sequences that are highly expressed in the high metastatic
potential cells can be indicative of increased expression of genes or regulatory
sequences involved in the metastatic process. A patient sample displaying an increased
level of one or more of these polynucleotides may thus warrant more aggressive
treatment. In another example, sequences that display higher expression in the low
metastatic potential cells can be associated with genes or regulatory sequences that
inhibit metastasis, and thus the expression of these polynucleotides in a sample may
warrant a more positive prognosis than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential breast cancer cells and low metastatic
potential breast cancer cells.

Table 7 

Differentially expressed polynucleotides: Higher expression in
high metastatic potential breast cancer (lib3) relative to low metastatic
breast cancer cells (lib4)



SEQ ID NOs: 


Lib3 clones 


Lib4 clones 


lib3/lib4


472 


64 


0 


62 


1851 


6 


0 


6 


1856 


8 


0 


8 


1867 


6 


0 


6 


1872 


6 


0 


6 


1875 


12 


3 


4 


1923 


89 


22 


4 
SEQ ID NOs:


Lib3 clones


Lib4 clones


lib3/lib4




zl lo 


i 


A 

u 


7 


2119


1 


U 


7 


2135 


in 

5 1 


1 J 


■i 


2190 


1 c\ 


u 


1 O 


2193 


16 


c 

J 


"1 


2232 


12 


I 


O 


2239 


6 


U 


O 


2338 


21 


Z 




2378 


16 


4 


A 

4 


2394 


6 


0 


zr 
O 


2395 


6 


0 


iT 

o 


2490 


13 


-> 

3 


4 


2505 


16 


2 




2540 


8 


l 


o 
a 


2542 


11


1


11


2607


11


2 


5 


2640 


22 


5 


4 


2674 


8 


0 


o 

8 


2679 


19 


0 


19 


2684 


14 


4 


3 


2707 


8 


0 


8 


2724 


9 


0 


9 


2757 


6 


0 


o 


2776 


10 


0 


10 


2804 


13 


2 


6 


2818 


✓* 

6 


0 


/: 
o 


2906 


1 A 

14 


0 


14 


2959 


26 


o 
8 


3 


2964 


17 


A 

4 


4 


2968 


6 


U 


o 


2977 


22 


3 


1 


2980 


13 


i 
1 


1 1 


3010 


O 


U 


o 




1 n 


i 


1 0 


3071 


33 


12 


3 


3072 


9 


1 


9 


3095 


19 


3 


6 


3097 


11 


2 


5 


3173 


12 


2 


6 


3203 


8 


1 


8 


3210 


27 


8 


3 




SEQ ID NOs: 


Lib3 clones 


Lib4 clones 


lib3/lib4


3212 


13 


1 


13 


3284 


8 


0 


8 


3288 


6 


0 


6 


3331 


14 


3 


5 


3335 


13 


1 


13 



Table 8 

Differentially expressed polynucleotides: Higher expression in
low metastatic breast cancer cells (lib4) relative to high metastatic
potential breast cancer (lib3)



SEQ ID NOs: 


Lib 3 Clones 


Lib 4 Clones 


lib4/lib3


402 


0 


6 


6 


614 


3 


21 


7 


624 


0 


6 


6 


626 


0 


8 


8 


712 


0 


9 


9 


744 


0 


7 


7 


1325 


2 


29 


15 


1452 


2 


13 


7 


1880 


0 


9 


9 


1915 


0 


7 


7 


1951 


0 


6 


6 


1955 


8 


32 


4 


2015 


0 


7 


7 


2046 


0 


7 


7 


2076 


1 


22 


23 


2087 


0 


6 


6 


2124 


0 


9 


9 


2145 


0 


8 


8 


2162 


0 


6 


6 


2163 


0 


12 


12 


2164 


5 


19 


4 


2172 


2 


15 


8 


2192 


5


16 


3 


2244 


20 


43 


2 


2266 


3 


18 


6 


2313 


24 


56 


2 


2346 


1 


13 


13 



SEQ ID NOs:


Lib 3 Clones


Lib 4 Clones


lib4/lib3


2355 


o 


10 


10 


9371 


o 


6 


6 


93Q3 


1 


17 


17 


9404 


1 


21 


22 


944.3 


n 


6 




Z*rOU 


n 
\j 


1 1 

1 1 


1 ] 


9^9"* 






6 


9<\9"^ 
ZJ / J 


1 

1 


in 


10 


9^78 
/ o 


n 






9^ 9. A. 


1 

1 


1 7 

1 / 


17 




KJ 






7AnQ 

zouy 


i 
i 




Q 


Z.OJZ 


D 


94 




2.1 14 


c 

J 


94 
ZH 






U 


D 


o 


Z / JZ 


1 
1 


1 d. 
1 *f 


1 A 


z /y4 


4 


1 « 
1 J 


*+ 


zozo 


U 


7 


7 


zyo / 


J 


1 c 
I J 






1 
1 


1 4 


1 A 
1*4 




on 
zu 


JO 


*l 


-JU4 / 


,1 


1 7 




in^7 
JUj / 


7 

z 


1 7 


Q 


JU/J 


7 
z 


1 1 
1 1 


f. 


JU /o 


u 


/£ 

o 


O 


J 1UZ 


u 




f. 
\J 


3128 


15 


52 


4 


3132 


15 


52 


4 


3142 


0 


6 


6 


3187 


22 


49 


2 


3253 


23 


96 


4 


3282 


19 


46 


2 


3285 


20 


40 


2 


3346 


0 


9 


9 



EXAMPLE 6

Polynucleotides Differentially Expressed in High Metastatic Potential Lung 
Cancer Cells Versus Low Metastatic Lung Cancer Cells 

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential lung
cancer cells and low metastatic lung cancer cells. Expression of these sequences in lung
cancer tissue can be valuable in determining diagnostic, prognostic and/or treatment
information. For example, sequences that are highly expressed in the high metastatic
potential cells can be indicative of increased expression of genes or regulatory
sequences involved in the metastatic process. A patient sample displaying an increased
level of one or more of these polynucleotides may thus warrant more aggressive
treatment. In another example, sequences that display higher expression in the low
metastatic potential cells can be associated with genes or regulatory sequences that
inhibit metastasis, and thus the expression of these polynucleotides in a sample may
warrant a more positive prognosis than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential lung cancer cells and low metastatic
potential lung cancer cells:
Table 9

Differentially expressed polynucleotides: Higher expression in high 
metastatic potential lung cancer cells (lib8) relative to low 
metastatic lung cancer cells (lib9) 



SEQ ID NO: 


Lib8 clones 


Lib9 clones 


lib8/lib9


14 


10 


0 


10 


137 


5 


0 


5 


151 


5 


0 


7 


152 


9 


0 


13 


171 


6 


0 


8 


200 


10 


0 


14 


254 


5 


0 


7 




5 


0 


7 


01\ 
Z> / 1 


5 


0 


7 




6 


1 


8 


412 


5 


0 


7 


507 


5 


0 


7 


520 


6 


0 


8 


530 


5 


0 


7 


588 


5 


0 


7 


623 


7 


0 


10 


637 


7 


0 


10 


660 


5 


0 


7 


678 


8 


0 


11 


680 


5 


0 


7 


700 


9 


2 


6 


714 


28 


13 


3 


774 


11 


0 


15 


812 


5 


0 


7 


834 


8 


2 


6 


901 


11 


2 


8 


1168 


5 


0 


7 


1333 


6 


0 


8 


1352 


5 


0 


7 


1524 


11 


1 


15 


1706 


5 


0 


7 


1752 


17 


9 


3 


1768 


20 


4 


7 


1769 


5 


0 


7 


1780 


6 


0 


8
SEQ ID NO:


Lib8 clones


Lib9 clones


lib8/lib9




1 78. 1 
1/51 


40, 


•5 
J 


1 Q 


1 7QQ 

1 /yy 


o 


1 
1 


Q 

o 


1 

15UJ 


o 


1 
1 


o 
o 


1011 

151 1 


1 O 




7 

Z 


1 QOA 


z: 
O 


ft 


o 
o 


iyiy 


o 
O 


i 
1 


1 1 
1 1 




O 




o 
o 


1975 


4J 




1 


OAT A 

2U24 


1 o 

lz 


1 
i 


1 / 


2045 


8 


1 


1 i 
1 1 


2060 


'OA 

20 


1 i 


-> 
Z 


2071 


i a 
16 


A 

4 


D 


2128 


5 


0 


/ 


2177 


1 A 

10 


Z 


/ 


2181 


A A 

44 


1 "5 

13 


5 


2184


1 1 


1 


15 


° 2185 


10 


A 

4 


J 


2283 


7 


0 


1 A 
10 


2311


10 


A 

4 


3 


2314 


10 


0 


1 A 

14 


2393 


14 


6 


3 


2398 


6 


1 


O 

o 


2460 


10 


4 


J 


1 C 1 A 

2514 


o 


A 
0 


Q 

5 


2597 


5 


A 
0 


1 


ZT C 

2657 


o 

5 


z 


O 


2669 


6 


1 

1 


Q 

o 


2670 


6 


1 
1 


o 
5 


304 / 


1 1 
z 1 


1 
J 


1 ft 
1 U 


3050 


1 A 


c 
3 




1AOO 

juyz 


"7 


1 
I 


1 ft 


3 14U 


1 fi 1 
151 


1 1 Q 
1 1 7 


z 


1 1 ^7 
J 1 J / 


c 
J 


A 

u 


7 


"^1 87 


1 6 




4 


3210 


5 


0 


7 


3220 


28 


4 


10 


3236 


7 


1 


10 


3249 


16 


0 


22 


3264 


8 


2 


6 


3305 


7 


0 


10 


3309 


20 


0 


28 
SEQ ID NO:


Lib8 clones


Lib9 clones


lib8/lib9


3318 


24 


4 


8 


3330 


5 


0 


7 


3331 


5 


0 


7 



Table 10 

Differentially expressed polynucleotides: Higher expression in low metastatic lung 
cancer cells (lib 9) relative to high metastatic potential lung cancer cells (lib 8) 



SEQ ID NO:


Lib 8 clones


Lib 9 clones


lib 9/lib 8


24 


3 


20 


c 

J 


53 


0 


1 O 


1 1 
1 3 


64 


0 


o 

8 


o 


70 


0 


1 1 


o 
5 


105 


10 


66 


c 

J 


129 


0 


16 


1 1 

1 1 


214 


1 


1 yl 
14 


i n 
1U 


233 


4 


JJ 


O 


2 j / 


f\ 
\J 


1 1 
1 J 


Q 


Z04 


r» 

u 


ZV 


1 1 




z 


1 7 
1 / 


o 




1 

1 


37 


26 


370 


0 


11 


8 


418 


0 


8 


6 


450 


0 


9


6 


461 


0 


9 


6 


484 


0 


26 


19 


494 


0 


41 


29 


517 


1 


12 


9 


522 


1 


11 


8 


581 


1 


17 


12 


614 


3 


23 


5 


706 


0 


11 


8 


726 


5 


23 


3 


806 


0 


14 


10 


824 


0 


9 


6 


836 


1 


14 


10 


874 


0 


12 


9 


900 


5 


21 


3 


1017 


2 


14 


5 
SEQ ID NO:


Lib 8 clones


Lib 9 clones


lib 9/lib 8


11/1/1 

1 144 


0 


5 


0 


1154


0 


12 




1 loo 


2 


A C 

45 


1 /C 

lo 


1170


l 


13 


9 


1302 


2 


13 


5 


1326 


l 


13 


9 


1327 


l 


13 


9 


1367 


0 


12 


9 


1377 


0 


12 


9 


1437 


2 


18 


6 


1442 


l 


14 


10 


1466 


0 


13 


9 


1476 


0 


13 


9 


1495 


0 


8 


6 


1496 


l 


13 


9 


1664 


38 


253 


5 


1682 


l 


17 


12 


1687 


0 


9 


6 


1758 


0 


8 


6 


1817 


4 


18 


3 


1837 


3 


16 


4 


1845 


3 


23 


5 


1856 


2 


17 


6 


1910 


l 


18 


13 


2146 


2 


16 


9 


2156 


0 


9 


6 


2463 


0 


12 


9 


2724 


10 


38 


3 


2749 


403 


2000 


4 


2801 


6 


25 


3 


2993 


3 


18 


4 


3080 


0 


10 


7 


3107 


3 


23 


5 


3292 


0 


20 


14 


3324 


110 


548 


4 



EXAMPLE 7

Polynucleotides Differentially Expressed in High Metastatic Potential 
Colon Cancer Cells Versus Low Metastatic Colon Cancer Cells 

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential colon
cancer cells and low metastatic colon cancer cells. Expression of these sequences in
colon cancer tissue can provide diagnostic, prognostic and/or treatment information.
For example, sequences that are highly expressed in the high metastatic potential cells
can be indicative of increased expression of genes or regulatory sequences involved in
the metastatic process. A patient sample displaying an increased level of one or more of
these polynucleotides may thus warrant more aggressive treatment. In another example,
sequences that display higher expression in the low metastatic potential cells can be
associated with genes or regulatory sequences that inhibit metastasis, and thus the
expression of these polynucleotides in a sample may warrant a more positive prognosis
than the gross pathology would suggest.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following table summarizes identified polynucleotides with
differential expression between high metastatic potential colon cancer cells and low
metastatic potential colon cancer cells:

Table 11

Differentially expressed polynucleotides: Higher expression in low metastatic colon
cancer cells (lib 2) relative to high metastatic potential colon cancer cells (lib 1)



SEQ ID NOs: 


Lib 1 clones 


Lib 2 clones 


lib 2/lib 1 


429 


0 


9 


10 


1494 


0 


8 


9 


1923 


34 


114 


4 


1986 


3 


12 


4 


2018 


0 


9 


10 


2036 


2 


10 


5 


2049 


8 


25 


3 


2135 


24 


87 


4 
SEQ ID NOs:


Lib 1 clones


Lib 2 clones


lib 2/lib 1


9146 


2 


16 


9 


9908 




27 


5 


991 S 
ZZ 1 J 


9 


1 1 


6 


997Q 


1 
■ 


10 


1 1 


Z JU / 


z 


19 




ZJ 1 J 


98 
Zo 


f\~> 
DZ 


9 


ZJ J / 




1 4 


3 


ZJOU 




91 
Z 1 


8 

o 


Z30z 


a 
u 






ZJ /o 


•3 
J 


1 9 
1 z 


4 1 


zjoy 


J 


90, 


7 


ZJ / 1 


u 


O 


e. 


nrnfl 
ZOOO 


->4 


1 79 
1 /Z 


1 




1 c 
1 5 


4 i 


j 


OiC 1 1 

261 1 


a 
U 


O 


D 


2636 


A 
U 


Q 

y 


i n 
1 u 


2641 


/ 


ZU 




2651) 


A 


o 
y 


i ft 


2662 


A 
U 


Q 

y 


i ft 


2674 


4 






2682 


A 

u 


o 


o 


Z /UZ 


y 


ZD 


jy 


z /U4 


o 
o 


Z J 




Z / 1 0 


O 

z 


1 9 ( 
1 Z \ 


u 


Z8U4 


Q 

y 


99 
ZZ 


-J 


zoz 1 


i i 
i j 


. 9Q 

Z7 


0 

z. 


oo/t a 
2o4U 


i 


o 
o 


Q 


Z546 


Z 


1 c 
1 J 


0 
o 


ZOOO 


A 
U 


f. 


V 


zyuo 


A 

u 






Zy I J 


44. 


1 09 


3 


Zyj J 


A 




6 




J 


16 


3 


2957 


1 


11 


12 


2959 


3 


27 


10 


2977 


16 


30 


2 


2980 


12 


27 


2 


3000 


2 


13 


7 


3009 


12 


29 


3 


3115 


0 


7 


8 


3156 


502 


2170 


5 
SEQ ID NOs:


Lib 1 clones 


Lib 2 clones 


lib 2/lib 1 


3210 


2 


21 


11 


3211 


0 


9 


10 


3213 


0 


7 


8 


3235 


2 


12 


6 


3251 


2 


12 


6 


3296 


3 


12 


4 


3335 


1 


8 


9 



EXAMPLE 8 

Polynucleotides Differentially Expressed in High Metastatic Potential 
Colon Cancer Patient Tissue Versus Normal Patient Tissue 

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high metastatic potential colon
cancer tissue and normal tissue. Expression of these sequences in colon cancer tissue
can provide diagnostic, prognostic and/or treatment information. For example,
sequences that are highly expressed in the high metastatic potential cells can be
indicative of increased expression of genes or regulatory sequences involved in the
advanced disease state which involves processes such as angiogenesis, dedifferentiation,
cell replication, and metastasis. A patient sample displaying an increased level of one
or more of these polynucleotides may thus warrant more aggressive treatment.

The differential expression of these polynucleotides can be used as a
diagnostic marker, a prognostic marker, for risk assessment, patient treatment and the
like. These polynucleotide sequences can also be used in combination with other
known molecular and/or biochemical markers.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential colon cancer tissue and normal colon
tissue:
Table 12

Differentially expressed polynucleotides isolated from samples from two patients
(patient 2 and patient 3): Lower expression in high metastatic potential colon tissue
(patient 2: lib 17; patient 3: lib 20) vs. normal colon tissue (patient 2: lib 15; patient
3: lib 18)



SEQ ID NO: 


lib 15 clones


lib 17 clones


lib 15/lib 17


69 


19 


7 


J 


123 


6 




o 


140 


24 


» 

o 


-J 

J 


197 


6 


n 




198 


1 1 -J 




1Z1 


254 




Q 


-3 
J 


412 


78 


Q 


1 

J 


512 


1 1 


1 
1 


1 7 


641 


1 7 


7 




642 


7 




o 
o 


954 


1? 


■J 


A. 


1011 


209 


1 ft 


1 4 


1024 


g 


A 
\f 


Q 


1040 


12 






1055 


26 


7 


4. 


1106


31 


J5 


z. 


1125 


17 


0 


1 8 

1 O 


1129 


17 


o 


1 8 


1138 


109 


0 


117 


1244 


14 


1 


15 


1253 


73 


0 


78 


1283 


34 


7 


5 


1285 


34 


7 


5 


1339 


13 


4 


3 


1474 


73 


0 


78 


1505 


18 


3 


6 


1553 


68 


6 


12 


1554 


2542 


14 


195 


1605 


2542 


14 


195 


1628 


6 


0 


6 


1643 


142 


4 


38 


1753 


12 


0 


10 


1764 


13 


0 


14 



SEQ ID NO:


lib 15 clones


lib 17 clones


lib 15/lib 17


SEQ ID NO: 


Lib 18 Clones 


Lib20 Clones 


lib 18/lib 20


105 


28 


1 1 


z 


198 


21 


0 


1 8 


254 


Q 

7 


0 


8 


412 


9 


0 


8


1011 


11 


1 


9 


1138 


14 


0 


12 


1253 


23 


0 


20 


1643 


18 


0 


15 


1764 


12 


0 


10 


3156 


140 


43 


3 



Table 13 

Differentially expressed polynucleotides isolated from samples from two patients
(patient 2 and patient 3): Lower expression in normal colon tissue (patient 2: lib 15;
patient 3: lib 18) vs. high metastatic potential colon tissue (patient 2: lib 17; patient 3:
lib 20).



SEQ ID NO: 


Lib 15 Clones 


Lib 17 Clones 


lib 17/lib 15 


321 


3 


23 


7 


363 


1 


9 


8 


836 


21 


99 


4 


859 


6 


20 


3 


885 


13 


28 


2 


916 


13 


28 


2 


981 


2 


11 


5 


1226 


8 


70 


8 


1308 


0 


8 


7 


1317 


29 


84 


3 


1429 


27 


127 


4 


1442 


0 


9 


8


1534 


1 


12 


11 


1540 


12 


43


3 


1552 


0 


7 


7 


1556 


1 


9 


8 


1557 


1 


9 


8 


1569 


2189 


5122 


2 


1571 


6 


18 


3 


1576 


3 


25 


8 
SEQ ID NO:


Lib 15 Clones


Lib 17 Clones


lib 17/lib 15


1 1 

1 Jo 1 


4 


22 


5 


1 OU 1 




1 S7 


6 


1 0 1 D 


Q 


48 

to 


5 


101 o 


1 < 

1 J 


61 


4 


i son 
lozu 


-> 
Z 


17 


8 
o 


lozz 




99 




loZo 


O 


3s 




104/ 






■J 


1664 


4 


09. 


7 


looi 


Z 


1 8 
1 o 


8 
o 


1704 


T 
J 


1 « 
1 j 


j 


1 O C\C\ 

1800 


U 


7 


7 


274V 


Zj 


OW 


z. 


2784 


4 


1 A 
1 4 


~i 


2805 


1 


Q 


8 
o 


2976 


J 


14 


A 


3128 


1 O 

18 


^7 




3129 


26 


1Z4 


/I 

T- 


3146 


64 


Z 1U 


-3 
J 


3150 


CsA f\ 

94U 


zzo / 


Z 


j i j i 


7 


1 s 


7 










SEQ ID NO: 


lib 18 clones 


lib 20 clones 


lib 20/lib 18 


865 


0 


5 


6 


1569 


1 


7 


8 


1580 


1 


7 


8 


1590 


1 


7 


8 


2790 


0 


5 


6 



EXAMPLE 9 

Polynucleotides Differentially Expressed in High Colon Tumor Potential 
Patient Tissue Versus Metastasized Colon Cancer Patient Tissue 

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from colon cancer tissue and cells derived
from colon cancer tissue metastases to liver. Expression of these sequences in colon
cancer tissue can provide diagnostic, prognostic and/or treatment information associated
with the transformation of precancerous tissue to malignant tissue. This information
can be useful in preventing progression to the advanced malignant state in these
tissues, and can be important in risk assessment for a patient.

The following table summarizes identified polynucleotides with
differential expression between high tumor potential colon cancer tissue and cells
derived from high metastatic potential colon cancer cells:



Table 14 

Differentially expressed polynucleotides:
Greater expression in metastatic colon tumor tissue (lib 20) vs.
colon tumor tissue (lib 19)

SEQ ID NO:   lib 19 clones   lib 20 clones   lib 20/lib 19

937          0               6               8
976          0               5               7
1520         1               8               11
1546         1               11              15
1550         1               11              15
1574         1               8               11
1580         0               7               9
1590         0               7               9
1599         8               21              4
1607         158             632             5
1622         1               7               9



Table 15 

Greater expression in colon tumor tissue (lib 19) than metastatic colon tissue (lib 20)

SEQ ID NO:   lib 19 clones   lib 20 clones   lib 19/lib 20

105          64              11              4
1011         53              1               40
1226         18              4               3
1571         8               0               6
1726         15              3               4
1811         17              2               6
2749         47              6               6
3146         19              2               7
3324         20              1               15
EXAMPLE 10

Polynucleotides Differentially Expressed in High Tumor Potential 
Colon Cancer Patient Tissue Versus Normal Patient Tissue 

A number of polynucleotide sequences have been identified that are
differentially expressed between cells derived from high tumor potential colon cancer
tissue and normal tissue. Expression of these sequences in colon cancer tissue can
provide diagnostic, prognostic and/or treatment information associated with the
prevention of the malignant state in these tissues, and can be important in risk
assessment for a patient. For example, sequences that are highly expressed in the
potential colon cancer cells are associated with or can be indicative of increased
expression of genes or regulatory sequences involved in early tumor progression. A
patient sample displaying an increased level of one or more of these polynucleotides
may thus warrant closer attention or more frequent screening procedures to catch the
malignant state as early as possible.

The following tables summarize polynucleotides that are differentially
expressed between high metastatic potential colon cancer cells and normal colon cells:

Table 16 

Differentially expressed polynucleotides detected in samples from patient (patient 2):
Higher expression in normal colon tissue (patient 2, lib 15)
vs. tumor potential colon tissue (patient 2, lib 16)



SEQ ID NO: 


lib 15 clones


lib 16 clones 


lib 16/lib 15 


69 


19 


7 


3 


105 


116 


54 


2 


140 


24 


4 


6 


197 


6 


0 


6 


198 


113 


3 


40 


254 


28 


6 


5 


412 


28 


6 


5 


642 


7 


0 


7 


830 


10 


2


5 


938 


31 


13 


3 


1011 


209 


37 


6 


1095 


12 


3 


4 


1125 


17 


0 


18 
SEQ ID NO:


lib 15 clones


lib 16 clones


lib 16/lib 15


1 190 


17 


n 

V 


1 8 


1 1 18 

1 1 JO 


i no 


1 

1 


1 1 5 


1 951 




1 

1 


77 




14 


1 1 


-J 
j 


1 98S 




1 1 


1 
j 


1 HQ 


1 1 


■J 


j 


1 4 JJ 


1 1 
1 1 


-} 
J 




1 zl"74 
14 /4 


/J 


1 
1 


77 


i <;n^ 


1 8 


O 




1 ^ S.A 






O 


1 




448 
44 S 


O 


1 C 1 yl 


JO 


1 4 


1 


IojU 




Q 


1 


1 AA1 
1 CrrJ 


1 49 


Z 


7S 

/J 


1646          39               14               3
1649          24               8                3
1677          19               6                3
1753          13               0                14
1764          13               0                14
1766          177              65               3
1772          24               8                3



Table 17 

Differentially expressed polynucleotides detected in samples from patient. Lower
expression in normal colon tissue (lib 18) than colon tumor tissue (lib 19)



SEQ ID NO:    lib 18 clones    lib 19 clones    lib 19/lib 18
3146          3                19               6
3150          21               228              10
3324          3                20               6



Table 18

Differentially expressed polynucleotides detected in samples from patient. Higher
expression in normal colon tissue (lib 18) than colon tumor tissue (lib 19)



SEQ ID NO:    lib 18 clones    lib 19 clones    lib 18/lib 19
198           21               2                12
465           6                0                7
489           6                0                7
745           6                0                7
859           11               2                6
976           7                0                8
1011          209              37               6
1045          8                1                9
1138          14               0                16
1253          23               0                26
1392          16               4                5
1474          23               0                26
1589          6                0                7
1591          22               11               2
1607          386              158              3
1643          18               0                21
1753          12               0                14
1764          12               0                14










SEQ ID NO:    lib 18 clones    lib 19 clones    lib 19/lib 18
105           28               64               2
1011          11               53               4
1226          2                18               8
1251          6                19               3
1559          1                9                8
1571          0                8                7
1608          1                9                8
1766          2                13               6
1782          1                9                8
1811          1                17               15



Table 19

Differentially expressed polynucleotides: 
Higher expression in colon tumor tissue 
(patient 2, lib 16) vs. normal colon tissue (patient 2, lib 15) 



SEQ ID NO:    lib 15 clones    lib 16 clones    lib 16/lib 15
7             1                9                9
164           6                19               3
734           4                15               4
836           21               53               2
928           2                11               5
965           2                11               5
987           2                11               5
1026          7                19               3
1044          4                16               4
1119          4                16               4
1226          8                46               5
1227          0                9                9
1251          7                95               13
1316          0                6                6
1429          27               81               3
1442          0                9                9
1540          12               28               2
1553          68               590              8
1560          4                24               6
1577          1                10               9
1588          5                20               4
1610          3                13               4
1620          2                23               11
1626          6                23               4
1673          2                15               7
2416          0                7                7
2749          23               54               2
2976          3                14               4
3129          26               64               2
3132          18               54               3

EXAMPLE 11

Polynucleotides Differentially Expressed in Growth Factor-Stimulated
Human Microvascular Endothelial Cells (HMEC) Relative to Untreated HMEC

A number of polynucleotide sequences have been identified that are
differentially expressed between human microvascular endothelial cells (HMEC) that
have been treated with growth factors and untreated HMEC.

Sequences that are differentially expressed between growth factor-treated
HMEC and untreated HMEC can represent sequences encoding gene products involved
in angiogenesis, metastasis (cell migration), and other developmental and oncogenic
processes. For example, sequences that are more highly expressed in HMEC treated
with growth factors (such as bFGF or VEGF) relative to untreated HMEC can serve as
markers of cancer cells of higher metastatic potential. Detection of expression of these
sequences in colon cancer tissue can provide diagnostic, prognostic and/or treatment
information associated with the prevention of achieving the malignant state in these
tissues, and can be important in risk assessment for a patient. A patient sample
displaying an increased level of one or more of these polynucleotides may thus warrant
closer attention or more frequent screening procedures to catch the malignant state as
early as possible.

The following tables summarize identified polynucleotides with
differential expression between growth factor-treated and untreated HMEC.



Table 20 

Differentially expressed polynucleotides:
Higher expression in untreated HMEC (lib 12) vs. bFGF treated HMEC (lib 13)

SEQ ID NO:    lib 12 clones    lib 13 clones    lib 12/lib 13
849           6                0                6
1059          6                0                6
1206          12               2                6
3208          12               0                12

Lower expression in untreated HMEC (lib 12) vs. bFGF treated HMEC (lib 13)

2748          3                12               4
3325          0                6                6



Table 21 

Differentially expressed polynucleotides:
Higher expression in untreated HMEC (lib 12) vs. VEGF treated HMEC (lib 14)

SEQ ID NO:    lib 12 clones    lib 14 clones    lib 12/lib 14
1150          9                0                9

Lower expression in untreated HMEC (lib 12) vs. VEGF treated HMEC (lib 14)



3324 



22 



50 



10 



15 



EXAMPLE 12 

Polynucleotides Differentially Expressed in Normal Prostate Cells
Relative to Prostate Cancer Cells

A number of polynucleotide sequences have been identified that are
differentially expressed between normal prostate cells and prostate
cancer cells. Expression of these sequences in prostate tissue suspected of being
cancerous can provide diagnostic, prognostic and/or treatment information. These
polynucleotide sequences can also be used in combination with other known molecular
and/or biochemical markers. The following table summarizes identified
polynucleotides with differential expression between normal prostate cells and
prostate cancer cells:



Table 22

Differentially expressed polynucleotides: normal prostate cell line (lib 21) 
vs. prostate cancer cell line (lib 22) 
Higher in lib 21 



SEQ ID NO:    lib 21 clones    lib 22 clones    lib 21/lib 22
53            17               2                8
1754          22               8                3
1801          7                0                7
1845          22               6                4
446           8                0                8
1410          6                0                6
2060          18               6                3
2143          12               3                4
2632          13               1                13
2899          16               2                8
3338          12               2                6

Higher in lib 22

86            2                13               7
93            0                9                9
687           0                9                9
1269          1                15               15
1581          25               74               3
1647          25               74               3
1649          12               27               2
1710          5                16               3
1717          5                16               3
1772          12               27               2
1960          0                6                6
2987          0                6                6
3128          13               42               3
3132          13               42               3
3150          263              962              4
3222          0                6                6
3268          0                6                6



EXAMPLE 13

Polynucleotides Differentially Expressed Across Multiple Libraries 

A number of polynucleotide sequences have been identified that are
differentially expressed between cancerous cells and normal cells across two or more
tissue types tested (i.e., breast, colon, lung, and prostate). Expression of these
sequences in a tissue of any origin can provide diagnostic, prognostic and/or treatment
information associated with the prevention of achieving the malignant state in these
tissues, and can be important in risk assessment for a patient. These polynucleotides
can also serve as non-tissue specific markers of, for example, risk of metastasis of a
tumor. The following polynucleotides were differentially expressed but without tissue
type-specificity in at least two of the breast, colon, lung, and prostate libraries tested:
53, 105, 355, 412, 614, 836, 1442, 1581, 1647, 1649, 1664, 1772, 1782, 1811, 1845,
1856, 1875, 1923, 2060, 2071, 2135, 2146, 2239, 2313, 2378, 2393, 2416, 2460, 2490,
2632, 2674, 2704, 2724, 2749, 2784, 2804, 2959, 2976, 2977, 2980, 2987, 3009, 3047,
3128, 3129, 3132, 3146, 3150, 3156, 3210, 3324, 3331, and 3335.
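
The idea of a marker that recurs across tissue types can be illustrated with a short sketch. The specification does not state how the list above was compiled, so the following is only an assumption-laden illustration: it takes the SEQ ID NOs reported as more highly expressed in colon tumor tissue (Table 19) and in the prostate cancer cell line (Table 22, "Higher in lib 22") and reports those that appear in at least two tissue types. The function name `shared_across_tissues` and the dictionary layout are hypothetical, and the breast and lung data are omitted.

```python
from collections import Counter

# SEQ ID NOs copied from two tables of this document (illustration only;
# breast and lung libraries are not included in this sketch).
higher_in_cancer = {
    # Table 19: higher in colon tumor tissue (lib 16) than normal colon (lib 15)
    "colon": {7, 164, 734, 836, 928, 965, 987, 1026, 1044, 1119, 1226, 1227,
              1251, 1316, 1429, 1442, 1540, 1553, 1560, 1577, 1588, 1610,
              1620, 1626, 1673, 2416, 2749, 2976, 3129, 3132},
    # Table 22: higher in prostate cancer cell line (lib 22) than normal (lib 21)
    "prostate": {86, 93, 687, 1269, 1581, 1647, 1649, 1710, 1717, 1772,
                 1960, 2987, 3128, 3132, 3150, 3222, 3268},
}


def shared_across_tissues(by_tissue, min_tissues=2):
    """Return sorted SEQ ID NOs present in at least `min_tissues` tissue sets."""
    counts = Counter(seq_id for ids in by_tissue.values() for seq_id in ids)
    return sorted(seq_id for seq_id, n in counts.items() if n >= min_tissues)


print(shared_across_tissues(higher_in_cancer))  # prints [3132]
```

With only these two tables as input, SEQ ID NO: 3132 is the single shared entry, and it also appears in the list above. Adding further tissue sets, or sets for the opposite direction of expression, would extend the same counting approach.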

Those skilled in the art will recognize, or be able to ascertain, using not
more than routine experimentation, many equivalents to the specific embodiments of
the invention described herein. Such specific embodiments and equivalents are
intended to be encompassed by the following claims.

All publications and patent applications cited in this specification are
herein incorporated by reference as if each individual publication or patent application
were specifically and individually indicated to be incorporated by reference. The
citation of any publication is for its disclosure prior to the filing date and should not be
construed as an admission that the present invention is not entitled to antedate such
publication by virtue of prior invention.

Although the foregoing invention has been described in some detail by
way of illustration and example for purposes of clarity of understanding, it is readily
apparent to those of ordinary skill in the art in light of the teachings of this invention
that certain changes and modifications may be made thereto without departing from the
spirit or scope of the appended claims.

Deposit Information: 

The following materials were deposited with the American Type Culture
Collection (ATCC); CMCC = Chiron Master Culture Collection:

cDNA Libraries Deposited with ATCC

Tube Number    Deposit Date    ATCC Accession No.    CMCC Accession No.
ES137          May 30, 2000
ES138          May 30, 2000
ES139          May 30, 2000
ES140          May 30, 2000
ES141          May 30, 2000
ES142          May 30, 2000
ES143          May 30, 2000
ES144          May 30, 2000
ES145          May 30, 2000
ES146          May 30, 2000
ES147          May 30, 2000
ES148          May 30, 2000
ES149          May 30, 2000
ES150          May 30, 2000
ES151          May 30, 2000
ES152          May 30, 2000
ES153          May 30, 2000
ES154          May 30, 2000
ES155          May 30, 2000
ES156          May 30, 2000
ES157          May 30, 2000
ES158          May 30, 2000
ES159          May 30, 2000
ES160          May 30, 2000
ES161          May 30, 2000
ES162          May 30, 2000
ES163          May 30, 2000
ES164          May 30, 2000
ES165          May 30, 2000
ES166          May 30, 2000
ES167          May 30, 2000

Table 23 lists the clones for each deposit, designated by "tube" number.
This deposit is provided merely as a convenience to those of skill in the art, and is not an
admission that a deposit is required under 35 U.S.C. §112. The sequence of the
polynucleotides contained within the deposited material, as well as the amino acid
sequence of the polypeptides encoded thereby, are incorporated herein by reference and
are controlling in the event of any conflict with the written description of sequences
herein. A license may be required to make, use, or sell the deposited material, and no
such license is granted hereby.



Retrieval of Individual Clones from Deposit of Pooled Clones 

Where the ATCC deposit is composed of a pool of cDNA clones, the
deposit was prepared by first transfecting each of the clones into separate bacterial cells.
The clones were then deposited as a pool of equal mixtures in the composite deposit.
Particular clones can be obtained from the composite deposit using methods well
known in the art. For example, a bacterial cell containing a particular clone can be
identified by isolating single colonies, and identifying colonies containing the specific
clone through standard colony hybridization techniques, using an oligonucleotide probe
or probes designed to specifically hybridize to a sequence of the clone insert (e.g., a
probe based upon unmasked sequence of the encoded polynucleotide having the
indicated SEQ ID NO). The probe should be designed to have a Tm of approximately
80°C (assuming 2°C for each A or T and 4°C for each G or C). Positive colonies can
then be picked, grown in culture, and the recombinant clone isolated. Alternatively,
probes designed in this manner can be used in PCR to isolate a nucleic acid molecule
from the pooled clones according to methods well known in the art, e.g., by purifying
the cDNA from the deposited culture pool, and using the probes in PCR reactions to
produce an amplified product having the corresponding desired polynucleotide
sequence.
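
The Tm rule of thumb stated above (2°C for each A or T, 4°C for each G or C) can be worked through in a few lines. The following is a minimal sketch under stated assumptions, not a probe-design tool: the function names `wallace_tm` and `pick_probe` are hypothetical, the input sequences are made up for illustration, and extending the probe from the start of the insert until the estimate reaches 80°C is just one simple way to apply the rule.

```python
def wallace_tm(probe: str) -> int:
    """Estimate Tm in degrees C as 2 per A/T plus 4 per G/C, as stated in the text."""
    seq = probe.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in probe: {sorted(unexpected)}")
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


def pick_probe(insert_seq: str, target_tm: int = 80, start: int = 0):
    """Extend a probe from `start` until its estimated Tm reaches `target_tm`.

    Returns the probe string, or None if the sequence is too short.
    (Hypothetical helper; the specification does not prescribe this procedure.)
    """
    for end in range(start + 1, len(insert_seq) + 1):
        candidate = insert_seq[start:end]
        if wallace_tm(candidate) >= target_tm:
            return candidate
    return None


# Made-up sequences, for illustration only.
print(wallace_tm("GC" * 10))           # 20 G/C bases -> 80
print(len(pick_probe("AT" * 30)))      # 40 A/T bases are needed to reach 80
```

Under this rule a probe made entirely of G and C reaches about 80°C at 20 bases, while one made entirely of A and T needs 40, so probes for typical mixed sequences fall between those lengths.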

Table 23

Clone Name          Tube Number
M00001351A:B02      ES137
M00001356A:H11      ES137
M00001363D:D09      ES137
M00001395D:H02      ES137
M00001439C:H06      ES137
M00001476B:G10      ES137
M00001582A:E02      ES137
M00003750D:E06      ES137
M00003761C:F02      ES137
M00003770A:E05      ES137
M00003786A:A11      ES137
M00003800A:F09      ES137
M00003816D:E11      ES137
M00003902A:C03      ES137
M00003991C:F06      ES137
M00003995B:E03      ES137
M00004046C:A08      ES137
M00004105D:D05      ES137
M00004139B:B10      ES137
M00004140D:C03      ES137
M00004144A:H05      ES137
M00004152A:C12      ES137
M00004155D:A10      ES137
M00004168A:G11      ES137
M00004197B:H10      ES137
M00004222C:E03      ES137
M00004234A:E07      ES137
M00004239B:F11      ES137
M00004241B:H07      ES137
M00004264B:A05      ES137
M00004278A:F09      ES137
M00004282D:C11      ES137
M00004308C:C06      ES137
M00004340C:C07      ES137
M00004354D:E05      ES137
M00004361A:H02      ES137
M00004372B:F07      ES137
M00004378A:B10      ES137
M00004393B:E07      ES137
M00023282A:C02      ES137
M00023300D:C11      ES137
M00023316C:G08      ES137
M00023333D:C12      ES137
M00023352B:F03      ES137
M00023352D:H03      ES137
M00023376B:G04      ES137
M00023377B:F01      ES137
M00023398B:D12      ES137
M00023399C:E10      ES137
M00026803A:F08      ES137
M00026843B:D10      ES137
M00026850D:F09      ES137
M00026851B:F01      ES137
M00026856D:F02      ES137
M00026857D:G12      ES137
M00026859D:D01      ES137
M00026860B:C05      ES137
M00026865B:A06      ES137
M00026868C:E11      ES137
M00026878A:F05      ES137
M00026882D:G09      ES137
M00026885A:H09      ES137
M00026901A:G07      ES137
M00026914A:H10

CLAIMS



We claim: 



1. A library of polynucleotides, the library comprising the sequence
information of at least one of SEQ ID NOs: 1-3351.

2. The library of claim 1, wherein the library is provided on a nucleic
acid array.

3. The library of claim 1, wherein the library is provided in a
computer-readable format.

4. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in a cancer cell of high
metastatic potential relative to a control cell, wherein the control cell is a normal cell or a
cell of low metastatic potential, wherein the expression is greater in the metastatic tissue,
and wherein the sequence is selected from the group consisting of SEQ ID NOs: 14, 137,
151, 152, 171, 200, 254, 262, 271, 348, 412, 472, 507, 520, 530, 588, 623, 637, 660, 678,
680, 700, 714, 774, 812, 834, 901, 937, 976, 1168, 1333, 1352, 1520, 1524, 1546, 1550,
1574, 1580, 1590, 1599, 1607, 1622, 1706, 1752, 1768, 1769, 1780, 1781, 1799, 1803,
1811, 1851, 1856, 1867, 1872, 1875, 1884, 1919, 1923, 1939, 1975, 2024, 2045, 2060,
2071, 2118, 2119, 2128, 2135, 2177, 2181, 2184, 2185, 2190, 2193, 2232, 2239, 2283,
2311, 2314, 2338, 2378, 2393, 2394, 2395, 2398, 2460, 2490, 2505, 2514, 2540, 2542,
2597, 2607, 2640, 2657, 2669, 2670, 2674, 2679, 2684, 2707, 2724, 2757, 2776, 2804,
2818, 2906, 2959, 2964, 2968, 2976, 2980, 2987, 3010, 3043, 3047, 3050, 3071, 3072,
3092, 3095, 3097, 3140, 3157, 3173, 3187, 3203, 3210, 3212, 3220, 3236, 3249, 3264,
3284, 3288, 3305, 3309, 3318, 3330, 3331, and 3335.

5. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal colon tissue
relative to colon cancer tissue, wherein the expression is greater in the cancer tissue, and
wherein the sequence is selected from the group consisting of SEQ ID NOs: 7, 164, 734,
836, 928, 965, 987, 1026, 1044, 1119, 1226, 1227, 1251, 1316, 1429, 1442, 1540, 1553,
1560, 1577, 1588, 1610, 1620, 1626, 1673, 2416, 2749, 2976, 3129 and 3132.

6. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal colon tissue
relative to colon cancer tissue, wherein the expression is greater in normal tissue than
cancer tissue, and wherein the sequence is selected from the group consisting of SEQ ID
NOs: 105, 198, 465, 489, 745, 859, 976, 1011, 1045, 1138, 1226, 1251, 1253, 1392, 1474,
1559, 1571, 1589, 1591, 1607, 1608, 1643, 1753, 1764, 1766, 1782, 1811, 2749, 2784,
2790, 2805, 2976, 3128, 3129, 3146, 3150, and 3151.

7. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal human
prostate cells relative to human prostate cancer cells, wherein the expression is greater
in normal cells than cancer cells, and wherein the sequence is selected from the group
consisting of SEQ ID NOs: 53, 446, 1410, 1754, 1801, 1845, 2060, 2143, 2632, 2899,
and 3338.

8. The library of claim 1, wherein the library comprises a
polynucleotide corresponding to a gene differentially expressed in normal human
prostate cells relative to human prostate cancer cells, wherein the expression is greater
in cancer cells than normal cells, and wherein the sequence is selected from the group
consisting of SEQ ID NOs: 86, 93, 687, 1269, 1581, 1647, 1649, 1710, 1717, 1772,
1960, 2987, 3128, 3132, 3150, 3222, and 3268.

9. An isolated polynucleotide comprising a nucleotide sequence 
having at least 90% sequence identity to an identifying sequence of SEQ ID NOs: 1-3351 or 
a degenerate variant or fragment thereof. 

10. A recombinant host cell containing the polynucleotide of claim 9. 

11. An isolated polypeptide encoded by the polynucleotide of claim 9. 

12. An antibody that specifically binds a polypeptide of claim 11.

13. A vector comprising the polynucleotide of claim 9.

14. A method of detecting differentially expressed genes correlated 
with a cancerous state of a mammalian cell, the method comprising the step of: 

detecting at least one differentially expressed gene product in a test sample
derived from a cell suspected of being cancerous, wherein the gene product is encoded by a
gene corresponding to a sequence of at least one of SEQ ID NOs: 14, 137, 151, 152, 171,
200, 254, 262, 271, 348, 412, 472, 507, 520, 530, 588, 623, 637, 660, 678, 680, 700, 714,
774, 812, 834, 901, 937, 976, 1168, 1333, 1352, 1520, 1524, 1546, 1550, 1574, 1580,
1590, 1599, 1607, 1622, 1706, 1752, 1768, 1769, 1780, 1781, 1799, 1803, 1811, 1851,
1856, 1867, 1872, 1875, 1884, 1919, 1923, 1939, 1975, 2024, 2045, 2060, 2071, 2118,
2119, 2128, 2135, 2177, 2181, 2184, 2185, 2190, 2193, 2232, 2239, 2283, 2311, 2314,
2338, 2378, 2393, 2394, 2395, 2398, 2460, 2490, 2505, 2514, 2540, 2542, 2597, 2607,
2640, 2657, 2669, 2670, 2674, 2679, 2684, 2707, 2724, 2757, 2776, 2804, 2818, 2906,
2959, 2964, 2968, 2976, 2980, 2987, 3010, 3043, 3047, 3050, 3071, 3072, 3092, 3095,
3097, 3140, 3157, 3173, 3187, 3203, 3210, 3212, 3220, 3236, 3249, 3264, 3284, 3288,
3305, 3309, 3318, 3330, 3331, and 3335,

wherein detection of the differentially expressed gene product is correlated with
a cancerous state of the cell from which the test sample was derived.

15. A method of detecting differentially expressed genes correlated 
with a cancerous state of a mammalian cell, the method comprising the step of: 

detecting at least one differentially expressed gene product in a test
sample derived from a cell suspected of being cancerous, wherein the gene product is
encoded by a gene corresponding to a sequence of at least one of SEQ ID NOs: 7, 164,
734, 836, 928, 965, 987, 1026, 1044, 1119, 1226, 1227, 1251, 1316, 1429, 1442, 1540,
1553, 1560, 1577, 1588, 1610, 1620, 1626, 1673, 1960, 2416, 2749, 2976, 2987, 3128,
3129, 3132, 3150, 3222, and 3268,

wherein detection of the differentially expressed gene product is correlated with
a cancerous state of the cell from which the test sample was derived.